The limitations of traditional database management systems (DBMS) in supporting
advanced  application  domains  such  as  Computer-Aided  Design/Manufacturing 
(CAD/CAM),  Computer-Aided  Software  Engineering  (CASE)  and  Geographical 
Information  Systems  (GIS),  have  motivated  research  on  the  so-called  next-generation 
DBMSs,  including  extensible  object-oriented  database  management  systems  (OODBMSs), 
active  OODBMSs,  and  object-oriented  database  programming  languages  (OODBPLs). 
Despite  their  success,  these  systems  have  two  limitations:  their  object  models  (i)  are  too 
simple,  and  (ii)  have  a  fixed  set  of  modeling  constructs.  Most  of  these  systems  use  the 
model of an OO programming language (e.g., C++) as their object model. The structural
properties  of  objects  are  defined  by  means  of  generalization  and  aggregation  associations. 
Their behavioral properties are defined by method specifications (or signatures). All
operational semantics (i.e., control and logic) are implemented and thus hidden in the
method  implementations.  In  recent  work  on  active  DBMSs,  event-condition-action  (ECA) 
rules  have  been  proposed  and  used  in  these  systems  as  an  abstraction  to  represent  some 
of  the  operational  semantics  associated  with  objects.  However,  if  the  same  types  of 
operational  semantics  are  associated  with  many  types  of  objects,  then  many  ECA  rules 
will have to be written repeatedly. On the other hand, if the data model of the DBMS
or  DBPL  is  extensible,  then  new  modeling  constructs  such  as  new  class  or  association 
types  and  new  constraint  types  specified  by  keywords  can  be  introduced  and  used  to 
model  a  database.  The  names  and  keywords  of  these  new  types  can  be  used  instead  of 
defining  them  explicitly  with  rules  and  method  implementations.  We  have  designed  and 
implemented  an  extensible  knowledge  base  programming  language  (KBPL)  called  K.3. 
The  features  of  K.3  include  (i)  an  extensible  object  model,  OSAM*/X,  which  can  be 
extended  with  new  constraint  types,  association  types  and  class  types,  and  (ii)  an 
extensible  class  specification  construct  which  allows  the  addition  of  new  model  extensions 
without  requiring  the  redesign  or  modification  of  the  compiler.  These  features  allow  the 
language  and  its  underlying  object  model  to  be  tailored  to  meet  the  diverse  programming 
and data modeling requirements of different application domains.

CHAPTER 1
INTRODUCTION 

1.1  Motivation 
1.1.1  Current  Trends  in  DBMS  Technology 

The  limitations  of  traditional  database  management  systems  (DBMS)  in  supporting 
advanced  application  domains  such  as  Computer-Aided  Design/Manufacturing 
(CAD/CAM),  Computer-Aided  Software  Engineering  (CASE),  Geographical  Information 
Systems  (GIS)  and  Multimedia  data  management,  have  motivated  research  on  the 
so-called next-generation DBMSs [ACM91]. One class of next-generation DBMSs is
Object-Oriented DBMSs (OODBMSs), which combine DBMS functionalities with
features  of  the  object-oriented  programming  paradigm  such  as  data  abstraction, 
encapsulation,  inheritance  and  polymorphism.  In  recent  years,  many  research  efforts  have 
been  made  towards  (i)  extensible  OODBMSs,  such  as  EXODUS  [Car90],  PROBE 
[Ore88],  DASDBS  [Sch90],  POSTGRES  [Sto91],  OpenOODB  [Wel92]  and  Starburst 
[Loh91]; (ii) active OODBMSs (i.e., systems with the ability to perform a set of
operations triggered by the detection of certain events), such as ODE [Geh96], Sentinel
[Cha94] and REACH [Buc95]; and (iii) database programming languages, which have
been  proposed  as  a  solution  to  the  so-called  "impedance  mismatch  problem"  [Cop84] 
caused  by  the  dissimilarities  between  the  underlying  data  models  and  programming 
paradigms of database languages and programming languages. Most of the existing
object-oriented DBPLs, such as E [Ric93] and O++ [Agr93], are based on the language C++.
Another tendency, mostly seen in commercial OODBMSs, has been to use C++
as  the  specification  and  method  implementation  language  of  the  so-called  "persistent 
classes"  which  are  derived  from  a  set  of  classes  which  provide  the  functionalities  of  the 
DBMS.  This  is  exemplified  by  the  supporting  languages  of  commercial  OODBMSs  like 
Ontos  [Ont91],  Versant  [Rot91]  and  ObjectStore  [Vad96]. 

1.1.2  The  Need  for  Model  Extensibility 

Despite  their  success,  the  systems  and  OODBPLs  mentioned  above  have  two  common 
limitations:  their  object  models  (i)  are  too  simple  to  capture  different  types  of  data 
semantics,  and  (ii)  have  a  fixed  set  of  modeling  constructs.  Most  of  these  systems  use 
the model of an OO programming language (e.g., C++) as their object model. The
structural  properties  of  objects  are  defined  by  means  of  generalization  (Is-a  relationship) 
and  aggregation  (attributes  or  data  properties)  associations.  Their  behavioral  properties 
are  defined  by  method  specifications  (or  signatures).  All  operational  semantics  (i.e., 
control  and  logic)  are  implemented  and  thus  hidden  in  the  method  implementations.  In 
recent  work  on  active  DBMSs,  event-condition-action  (ECA)  rules  have  been  proposed 
and  used  in  active  DBMSs  as  an  abstraction  to  represent  some  of  the  operational 
semantics  associated  with  objects.  However,  if  the  same  types  of  operational  semantics 
are  associated  with  many  types  of  objects,  then  a  lot  of  ECA  rules  will  have  to  be 
repeatedly  written.  On  the  other  hand,  if  the  data  model  of  a  DBMS  or  DBPL  is 
extensible,  then  new  modeling  constructs  such  as  new  class  or  association  types  and  new 
constraints  specified  by  keywords  can  be  introduced  and  used  to  model  a  database 
instead of expressing their semantics in method implementations and rules. Once new
modeling constructs are defined and become part of an extensible model, they can be
used simply by naming these constructs in a database schema or the specification
part  of  an  object-oriented  program.  Furthermore,  the  underlying  object  model  can  be 
customized  to  satisfy  the  modeling  requirements  of  a  given  application  domain  by 
extending  the  model  to  contain  just  those  needed  modeling  constructs.  This  flexibility 
reduces  the  system  complexity  and  the  associated  development  and  maintenance  costs. 

1.1.3  The  Need  for  an  Extensible  Object-Oriented  Database  Programming  Language 

The  use  of  an  OODBPL  solves  the  "impedance  mismatch  problem"  caused  by  the 
dissimilar  data  models  and  programming  paradigms  used  in  database  languages  and 
programming  languages.  This  is  achieved  by  integrating  database  definition  and 
manipulation  language  constructs  with  the  traditional  programming-language  constructs. 
Programs  written  in  an  OODBPL  are  processed  with  the  support  of  DBMS  functions  such 
as  persistence,  integrity  control,  concurrency,  recovery,  etc.  Using  an  OODBPL, 
application  developers  specify  the  properties  of  application  objects  in  terms  of  classes 
which  are  defined  by  using  the  modeling  constructs  of  the  underlying  data  model  of  the 
language.  If  the  data  model  is  extensible,  then  the  DBPL  should  also  be  extensible  in 
order  to  reflect  the  model  extensions.  In  other  words,  any  model  extension  needs  to  have 
a  representation  in  the  language  in  order  for  the  user  to  use  the  modeling  construct.  For 
example,  if  a  data  model  is  extended  to  allow  the  definition  of  a  KEY  constraint,  then 
some  keyword  like  KEY  should  be  part  of  the  language.  Traditionally,  the  addition  of 
new  features  to  a  data  model  would  require  the  redesign  or  modification  of  the  compiler 
or interpreter of the data definition language (DDL), which involves a lot of
programming effort. If the language is extensible, however, the addition or modification
of the data modeling constructs should not require that the language translator be
modified.

1.2  Overview 

1.2.1  K.3:  An  Extensible  Knowledge  Base  Programming  Language 

In order to overcome the limitations of fixed data models and non-extensible DBPLs
found in the existing systems, we have designed and implemented an extensible
knowledge base programming language (KBPL) called K.3, which is an extension of
earlier versions of the language (K.1 [Arr92] and K.2 [Blo96, Shy96]). In addition to
the  common  features  of  OODBPLs,  K.3  has  the  following  features: 

Extensible object model. K.3's underlying object model is an extensible
object-oriented  semantic  association  model,  OSAM*/X,  which  has  an  extensible  set  of 
modeling  constructs  such  as  semantic  associations,  class  types,  class  and  association 
properties  (e.g.,  constraints),  knowledge  rules,  parameterized  rules  and  methods. 

Extensible  class  specification.  The  class  specification  component  of  K.3  is  extensible 
in  the  sense  that,  once  an  extensible  kernel  model  is  defined,  model  extensions  can  be 
included  automatically  in  the  language  without  redesigning  or  modifying  the  compiler. 

Ability  to  define  model  extensions.  K.3  provides  constructs  which  allow  a 
Knowledge-Base  Customizer  (KBC)  to  define  model  extensions.  Model  extensions  are 
defined  by  means  of  metaclasses  and  parameterized  rules.  In  K.3,  metaclasses  and 
parameterized  rules  are  treated  as  objects. 

Knowledge  base  management  system  (KBMS)  support.  Applications  compiled  by 
the  K.3  compiler  and  the  compiler  itself  are  supported  by  a  library  of  KBMS  functions 
which are used for the storage and manipulation of data-modeling and
application-schema knowledge. Instead of the traditional "inclusion" of specifications,
the  K.3  compiler  is  driven  by  the  data-modeling  knowledge  retrieved  from  a  knowledge 
base.  After  a  successful  compilation,  the  compiler  will  update  the  knowledge  base  with 
the  knowledge  about  the  schema  which  has  been  defined  (i.e.,  the  metadata). 

K.3  is  intended  to  be  used  by  many  types  of  users  such  as  application  developers 
(ADs),  knowledge-base  administrators  (KBAs)  and  knowledge-base  customizers  (KBCs). 
ADs  will  find  in  K.3  a  complete  language  for  the  implementation  of  client  applications 
without  having  to  worry  about  the  details  of  mapping  the  language  constructs  to  KBMS 
commands. A KBA uses K.3 for the specification and implementation of the conceptual
schema in a knowledge base, using a high-level semantic model defined by the KBC,
who  uses  K.3  to  define  model  extensions. 

1.2.2  Model  Extensibility 

The  object  model  of  K.3  is  extensible  in  the  sense  that,  once  an  extensible  kernel 
model  is  defined  and  the  underlying  KBMS  and  the  compiler  are  properly  extended  to 
support  it,  new  modeling  constructs  can  be  added  without  requiring  a  programming  effort 
to  change  the  system  implementation.  In  the  current  implementation,  model  extensions  are 
compiled  with  the  K.3  compiler,  then  the  resulting  executable  code  is  linked  together  with 
the  other  components  of  the  compiler  (e.g.,  parser,  semantic  checker,  code  generator)  to 
create  a  new  compiler  which  recognizes  the  model  extensions.  Currently,  the  model  can 
be  extended  with  new  class,  association  and  constraint  types. 

It  has  been  proposed  that  the  semantics  of  data  modeling  constructs  such  as  constraints 
be  represented  by  means  of  production  rules  [Pat93].  Our  approach  makes  use  of 
parameterized rules to represent the general semantics of new modeling constructs. Once
a modeling construct is used in the definition of a specific class, the corresponding
parameterized  rules  are  translated  into  explicit  rules  which  are  "bound"  to  the  class  to 
represent  the  class-specific  semantics. 
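
The binding step described above can be illustrated with a minimal Python sketch. It is not K.3 syntax: the names ParameterizedRule, ExplicitRule and bind, the placeholder notation, and the KEY-constraint template are all hypothetical and serve only to show how a general rule becomes a class-specific one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExplicitRule:
    """A class-specific rule produced by binding (hypothetical structure)."""
    target_class: str
    event: str
    condition: str
    action: str


@dataclass(frozen=True)
class ParameterizedRule:
    """General semantics of a modeling construct, with named placeholders."""
    event: str
    condition: str
    action: str

    def bind(self, target_class: str, **params: str) -> ExplicitRule:
        # Substitute the placeholders with values taken from the class
        # definition in which the modeling construct is used.
        values = {"cls": target_class, **params}
        return ExplicitRule(
            target_class=target_class,
            event=self.event.format(**values),
            condition=self.condition.format(**values),
            action=self.action.format(**values),
        )


# Hypothetical template for a KEY constraint: attribute values must be unique.
KEY_RULE = ParameterizedRule(
    event="before create({cls})",
    condition="exists x in {cls} where x.{attr} = new.{attr}",
    action="reject('duplicate {attr} in {cls}')",
)

part_key = KEY_RULE.bind("Part", attr="partNo")
print(part_key.event)      # before create(Part)
print(part_key.condition)  # exists x in Part where x.partNo = new.partNo
```

The same template could be bound to any number of classes, which is the repetition that explicit ECA rules alone would otherwise require.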

K.3  uses  metaclasses  to  capture  the  semantics  of  all  the  modeling  constructs  provided 
in  the  kernel  (basic)  model  as  well  as  all  the  model  extensions.  Thus,  all  the  modeling 
constructs  and  all  the  properties  specified  in  a  schema  (e.g.,  associations,  constraints, 
methods and rules) are represented by objects. By using a fully-fledged object-oriented
representation, as in GOOSE [Mor92], the model itself is defined using the modeling
constructs  of  the  kernel  model.  Such  a  model  of  the  object  model  is  called  the 
metamodel. 

Using  the  metamodel,  a  model  extension  is  carried  out  by  defining  its  semantics  by 
means  of  a  metaclass  which  contains:  (i)  attributes  (or  aggregation  associations)  for 
describing  the  data  properties  of  the  model  extension,  (ii)  methods  for  implementing 
auxiliary  procedures  and  functions  used  during  binding  (e.g.,  return  the  names  of  all  the 
superclasses  of  a  class),  (iii)  rules  for  specifying  model  constraints  (e.g.,  an  inheritance 
lattice  cannot  have  any  cycles),  and  (iv)  parameterized  rules  for  specifying  the  semantics 
of  the  extended  construct. 
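
Two of the metaclass components listed above, the auxiliary method that returns all the superclasses of a class and the model constraint that forbids cycles in the inheritance lattice, can be sketched in Python as follows. The dictionary-based schema representation and the function names are assumptions made for illustration; they do not reflect how the K.3 metamodel is actually stored.

```python
from typing import Dict, List, Set

# Hypothetical schema dictionary: class name -> names of its direct superclasses.
Lattice = Dict[str, List[str]]


def superclasses(lattice: Lattice, cls: str) -> Set[str]:
    """Return the names of all direct and indirect superclasses of `cls`."""
    found: Set[str] = set()
    stack = list(lattice.get(cls, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(lattice.get(parent, []))
    return found


def is_acyclic(lattice: Lattice) -> bool:
    """Model constraint: no class may be its own (indirect) superclass."""
    return all(cls not in superclasses(lattice, cls) for cls in lattice)


schema = {"Part": [], "Resistor": ["Part"], "Capacitor": ["Part"]}
print(sorted(superclasses(schema, "Resistor")))  # ['Part']
print(is_acyclic(schema))                        # True

schema["Part"] = ["Resistor"]  # introduce a cycle: Part -> Resistor -> Part
print(is_acyclic(schema))                        # False
```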

1.2.3  Language  Extensibility 

Traditionally,  the  underlying  data  model  of  a  DBMS  is  not  extensible.  Any  model 
extension  would  require  a  programming  effort  to  change  the  specification  language  (or 
"data  definition  language"  or  DDL)  of  the  DBMS  to  support  the  extension.  We  have 
designed  K.3  in  such  a  way  that  the  model  extensions  to  a  predefined  extensible  kernel 
model are reflected and can be used in the language without having to change the
implementation of the compiler (e.g., parser, semantic checker). We have achieved
this  by  providing  the  following  features: 

Compilation-time  update  and  retrieval  of  the  knowledge  about  the  model.  The 
compiler  is  capable  of  retrieving  and  updating  model  knowledge  stored  in  a  knowledge 
base  (KB)  during  compilation  time.  This  knowledge,  represented  in  the  form  of 
metaclasses,  drives  the  compilation  process. 

Metaclass  names  treated  as  keywords.  Metaclass  names  defined  in  the  KB  are 
treated  by  the  compiler  as  if  they  were  keywords,  thus  allowing  the  user  to  use  them  to 
define  a  database  without  having  to  change  the  implementation  of  the  compiler. 

Compilation-time expression evaluation. The K.3 compiler can evaluate expressions
during compilation time. This feature is important as it allows the creation of objects in
the dictionary while compiling specifications. Objects in the dictionary are created as the
result of evaluating expressions at compilation time. Dictionary objects represent
specifications defined in K.3.

Macros.  Macros  are  higher-level  abstractions  which  improve  readability  of 
specifications  by  hiding  complicated  expressions.  Macros  are  expanded  at 
compilation-time  into  expressions  which  are  also  evaluated  at  compilation  time,  as 
explained  above. 
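
The idea of treating metaclass names stored in the KB as keywords can be illustrated with the following Python sketch. It is a deliberately simplified stand-in, not the K.3 compiler: the knowledge base is modeled as a plain dictionary, and the names classify and knowledge_base, as well as the KEY entry, are hypothetical.

```python
from typing import Dict, List, Tuple

# Stand-in for the knowledge base: metaclass name -> kind of modeling construct.
# The entries are illustrative only.
knowledge_base: Dict[str, str] = {
    "Entity": "class type",
    "Aggregation": "association type",
    "Generalization": "association type",
}


def classify(tokens: List[str], kb: Dict[str, str]) -> List[Tuple[str, str]]:
    """Tag each token as a keyword if it names a metaclass in the KB."""
    return [(tok, kb.get(tok, "identifier")) for tok in tokens]


spec = ["Entity", "Part", "KEY", "partNo"]
print(classify(spec, knowledge_base))
# KEY is still an ordinary identifier at this point.

# A model extension adds a metaclass to the KB; classify() itself is unchanged.
knowledge_base["KEY"] = "constraint type"
print(classify(spec, knowledge_base))
```

Because the set of recognized keywords is read from the knowledge base rather than fixed in the translator, adding a construct changes data, not translator code.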

1.3  Dissertation  Organization 

The  rest  of  this  dissertation  is  organized  as  follows.  In  the  next  chapter,  we  shall 
present  a  survey  of  related  works.  In  Chapter  3,  we  present  the  object  model  of  K.3.  In 
Chapter  4,  the  language  K.3  is  presented.  Chapter  5  presents  in  more  detail  the  concepts 
of model and language extensibility. The implementation details are presented in Chapter
6. In Chapter 7, we present the results of using model and language extensibility
mechanisms  to  introduce  new  association  and  constraint  types  useful  for  data  modeling 
and  work  flow  modeling  and  management.  Finally,  our  conclusions  and  plans  for  future 
research  are  presented  in  Chapter  8. 


CHAPTER  2 
SURVEY  OF  RELATED  WORKS 

Our  work  is  related  to  several  existing  works  on  model  extensibility,  extensible 
OODBMS,  active  DBMS  and  object-oriented  database  programming  languages. 

2.1  Related  Works  on  Model  Extensibility 

The  work  by  Klas  et  al.  in  the  distributed  DBMS  VODAK  [Kla90]  is  closely  related 
to  ours  in  that  metaclasses  are  used  as  the  basis  for  model  extensibility.  The  VODAK  data 
model,  VDM,  can  be  extended  by  defining  different  types  of  classes  (e.g.,  generalization 
classes  and  relationships)  using  metaclasses  to  capture  their  semantics.  Unlike  our 
approach,  VODAK  uses  only  methods  to  implement  the  semantics  represented  by 
metaclasses.  Semantic  relationships  are  defined  using  classes  instead  of  our  semantic 
association  approach.  Its  definition  language,  VDL,  is  not  extensible,  therefore  the 
implementation  of  its  interpreter  needs  to  be  modified  in  order  to  reflect  model  extensions. 

SORAC  [Pec95]  is  an  extensible  data  modeling  system  which  uses  the  mapping 
approach to convert high-level semantics into implementation. Special emphasis is
placed on relationships and their constraints. Instead of having a single specification
language,  different  languages  (ARAC  and  DSDT)  are  used  to  represent  schemas  which 
are  translated  into  another  language  called  OLI.  OLI  is  a  lower-level  language  which 
supports the concept of class, type-less associations (called participants) and monitors.
Monitors are similar to OSAM*/X's rules, but more restricted in the sense that only
system-defined  (not  user-defined)  operations  can  be  specified  as  events,  and  a  limited  set 
of  operations  can  be  specified  in  the  action  clause.  New  types  of  relationships  are  added 
by  implementing  mappings  from  the  high-level  languages  to  OLI. 

ADAM  [Dia94,Pat93]  is  closely  related  to  our  work  in  the  use  of  metaclasses  and 
rules  for  representing  implicit  semantics.  However,  ADAM  uses  different  approaches  for 
translating  semantics  into  events,  conditions  and  actions.  Horn  Logic  is  used  to  map  from 
specifications  to  events,  while  predicates  and  composition  functions  are  used  for  the 
translation  of  conditions.  Rule  templates,  similar  to  OSAM*/X's  parameterized  rules,  are 
used  to  generate  the  action  part  of  the  rule.  Its  specification  language  is  based  on  Prolog. 
In fact, Prolog is used as the knowledge-base management engine of ADAM, which
leads to the inefficiency and undecidability problems discussed by Sebesta [Seb96].

GOOSE  [Pec93]  is  an  extensible  data  modeling  environment  with  emphasis  on  schema 
evolution  and  version  control.  Like  our  approach,  metaclasses  are  used  as  the  base  for 
extensibility.  Like  other  systems,  inheritance  of  attributes  and  methods  is  used  to  facilitate 
extensibility.  Instead  of  a  semantically  rich  model,  GOOSE  has  a  conventional 
object-oriented  data  model. 

2.2 Extensible OODBMSs

Most of the works on extensible OODBMSs emphasize system extensibility more
than model extensibility. In EXODUS [Car90] and DASDBS [Sch90], a "toolkit"
approach  is  used  for  achieving  extensibility  by  providing  a  set  of  modules  with 
well-defined  interfaces  which  are  used  to  customize  a  DBMS.  Starburst  [Loh91]  and 
POSTGRES [Sto91] were early attempts to include object-oriented concepts in the
context of the relational model. System extensions are achieved by adding functional
extensions  in  the  DBMS  and  a  set  of  mappings  from  the  extensions  (i.e.,  new  data  types, 
query  operators)  to  the  primitives  understood  by  the  system.  In  OpenOODB  [Wel92],  an 
object-oriented  open-architecture  approach  is  used  to  achieve  system  extensibility  by 
providing  a  set  of  classes  comprising  a  meta-architecture,  which  aims  to  satisfy 
metarequirements  (openness,  seamlessness,  reusability  and  transparency),  and  a  set  of 
extenders  which  aim  to  satisfy  functional  requirements  such  as  persistence,  naming, 
transaction  control,  query,  etc.  The  design  of  the  supporting  KBMS  of  K.3  is  based  on 
these  principles. 


2.3  Active  DBMS 

Active DBMSs are related to our research in their support of ECA rules to represent
procedural,  event-based  semantics.  Some  active  database  systems  extend  the  traditional 
relational  systems  with  rules.  Such  systems  include  Ariel  [Han92],  Starburst  [Loh91]  and 
POSTGRES  [Sto91].  In  these  systems,  the  query  language  is  extended  to  support  the 
definition  of  rules  which  are  triggered  by  system-defined  database  operations  like  create, 
update, retrieve, etc. In Ariel, rules are processed using a set-oriented rule processing
strategy called the recognize-act cycle, in which applicable rules are identified and processed
against  applicable  tuples,  until  no  more  applicable  rules  can  be  identified.  Starburst  uses 
both a set-oriented processing strategy and a tuple- or instance-oriented processing
strategy (the former supported by the Alert system, which uses active (or append-only) tables
to identify events). Both Ariel and Starburst provide means for defining rule priority. In
Ariel,  a  priority  value  between  -1000  and  1000  can  be  defined  for  each  rule,  whereas  a 
precedes clause in Starburst is used to establish precedence relationships. In K.3, we
adopt Ariel's approach to establishing rule priority. POSTGRES has two alternative
implementations;  one  does  tuple-oriented  processing  and  the  other  performs  set-oriented 
rule  processing  using  a  query  rewrite  system  (QRS). 
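
The general shape of a set-oriented recognize-act cycle with numeric rule priorities can be sketched in Python as follows. This is a generic illustration under stated assumptions, not Ariel's actual algorithm or data structures: the Rule class, the recognize_act function, the max_cycles guard, and the convention that a larger priority value fires first are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, int]


@dataclass
class Rule:
    name: str
    priority: int                      # value in [-1000, 1000], as in the text
    condition: Callable[[Row], bool]
    action: Callable[[Row], None]


def recognize_act(rules: List[Rule], rows: List[Row], max_cycles: int = 100) -> List[str]:
    """Repeat: find applicable rules, fire the highest-priority one on all
    matching rows (set-oriented), until no rule is applicable."""
    fired: List[str] = []
    for _ in range(max_cycles):        # guard against non-terminating rule sets
        applicable = [r for r in rules if any(r.condition(row) for row in rows)]
        if not applicable:
            break
        rule = max(applicable, key=lambda r: r.priority)
        for row in [row for row in rows if rule.condition(row)]:
            rule.action(row)
        fired.append(rule.name)
    return fired


rows = [{"qty": -5, "flag": 0}, {"qty": 3, "flag": 0}]
rules = [
    Rule("mark_nonneg", 0, lambda r: r["qty"] >= 0 and r["flag"] == 0,
         lambda r: r.update(flag=1)),
    Rule("clamp_negative", 500, lambda r: r["qty"] < 0,
         lambda r: r.update(qty=0)),
]
print(recognize_act(rules, rows))  # ['clamp_negative', 'mark_nonneg']
print(rows)  # [{'qty': 0, 'flag': 1}, {'qty': 3, 'flag': 1}]
```

In the example, the higher-priority rule fires first and its effect makes the lower-priority rule applicable to an additional row in the next cycle.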

Other  active  database  systems  like  HiPAC  [Day96],  Sentinel  [Cha94],  REACH 
[Buc95]  and  ADAM  [Pat93]  are  object-oriented  DBMSs  with  rule  processing  capabilities. 
Taking advantage of the OO paradigm, rules are treated as first-class objects. The same
approach  is  used  in  K.3.  In  these  systems,  more  advanced  event  types  (e.g.,  composite, 
clock,  external)  and  coupling  modes  (e.g.,  immediate,  deferred,  decoupled)  are  supported. 

2.4  Object-oriented  DBPLs 

None of the known OODBPLs addresses the issue of model extensibility or language
extensibility.  Many  of  the  existing  OODBPLs  are  supersets  of  the  language  C++.  This 
includes the support languages of commercial OODBMSs such as Ontos [Ont91], Versant
[Rot91] and ObjectStore [Vad96], and the languages E [Ric93] and O++ [Agr93].
Unlike E and O++, in which persistence is a class property rather than an object
property, K.3 allows some objects of a class to be persistent and others to be transient.
O++, like K.3, allows the definition of rules in the form of constraints and triggers. In
many C++-based OODBPLs, a navigational (procedural) approach is taken for the
retrieval of data. Instead, K.3 uses a declarative, high-level association- or pattern-based
query language. Since the data model of these languages is the C++ model, objects are
accessed  by  means  of  pointers  instead  of  address-independent  unique  object  identifiers 
(oids). 

CO2 [Lec89] is the support language of the O2 system. Rather than being based on
C++, CO2 has a different syntax and data model. Like K.3, it supports orthogonal
persistence, i.e., persistence is an object property rather than a class property. It provides
both a navigational construct (the "for" statement) for procedural retrieval of data, and
a query language based on SQL for declarative data retrieval. The data model of CO2 is
simpler  than  that  of  K.3  in  that  only  classes,  types,  tuples  and  methods  are  used  to  specify 
schemas. 
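
The distinction between persistence as an object property and persistence as a class property can be illustrated with a small Python sketch. The Store class, its persist and dump methods, and the oid string are hypothetical; the sketch shows only that two objects of the same class can differ in whether they are persistent, and does not represent the persistence mechanism of K.3 or CO2.

```python
import json
from dataclasses import dataclass, asdict
from typing import Dict


@dataclass
class Part:
    part_no: str
    cost: float


class Store:
    """Toy object store; only objects passed to persist() are saved."""

    def __init__(self) -> None:
        self._persistent: Dict[str, Part] = {}

    def persist(self, oid: str, obj: Part) -> None:
        self._persistent[oid] = obj

    def dump(self) -> str:
        return json.dumps({oid: asdict(o) for oid, o in self._persistent.items()})


store = Store()
kept = Part("R-100", 0.05)       # will be made persistent
scratch = Part("tmp", 0.0)       # same class, stays transient
store.persist("oid1", kept)
print(store.dump())  # {"oid1": {"part_no": "R-100", "cost": 0.05}}
```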


CHAPTER  3 
OBJECT  MODEL 

In  this  chapter,  we  present  the  object  model  of  K.3.  We  start  with  an  overview  of  the 
model,  then  discuss  the  concept  of  model  extensibility.  The  model  of  the  model,  or 
metamodel,  will  also  be  described.  Our  approach  to  model  extensibility  will  be  described, 
followed  by  a  brief  discussion  on  the  current  limitations  in  our  approach. 

3.1  Overview 

The  underlying  object  model  of  K.3  is  an  extensible  Object-oriented  Semantic 
Association  Model,  OSAM*/X,  which  is  an  extension  of  OSAM*  [Su89].  The  modeling 
constructs  of  OSAM*/X  can  be  summarized  as  follows: 

Objects.  The  extension  of  a  database  defined  by  OSAM*/X  can  be  viewed  as  a 
network  of  interconnected  objects.  Objects  are  things  of  interest  in  an  application's  world, 
including  physical  entities,  processes,  abstract  things,  relationships  and  values.  Objects 
are  interconnected  by  means  of  semantic  associations,  which  specify  the  semantics  of 
their  relationships. 

Classes.  Classes  are  an  abstraction  used  to  group  objects  that  have  common  properties. 
A class is used to represent the common properties of a group of objects in terms of their
structural  and  operational  semantics.  These  properties  are  described  by  means  of  semantic 
associations, methods and rules, as described below. The data representation of
an object in a class is called the instance of the object in that class. The type of a class
describes the properties that are common to all the instances of the class. Four types of
classes are defined in the kernel object model: (i) entity classes, which define and
contain objects that have identity (i.e., they can exist in the database on their own,
independent  of  other  objects  and  are  uniquely  identified  by  system-assigned  object 
identifiers  (oids))  and  represent  real-world  entities;  (ii)  domain  classes,  which  represent 
objects  that  do  not  exist  in  the  database  by  themselves  but  are  used  to  define  the 
descriptive  properties  (i.e.,  attribute  values)  and/or  association  properties  (i.e.,  object 
associations) of entity objects; (iii) schema classes, which act as containers of schema
objects (i.e., objects that represent schemas defined in a knowledge base); and (iv)
aggregate classes, which represent collections of objects such as Set, List, Array, etc.

Associations.  Semantic  associations  represent  relationships  among  objects.  An 
association is defined by a set of links, each of which is directed from the class where
it is defined (called the defining class) to one of the constituent classes. The type of
an  association  describes  the  semantics  of  the  relationship  it  represents.  Two  types  of 
associations  are  part  of  the  kernel  object  model:  Generalization  and  Aggregation.  A 
Generalization  association  type  is  used  to  model  the  relationship  that  a  constituent  class 
is  a  subclass  of  the  defining  class  (the  superclass),  and  the  subclass  inherits  all  the 
properties  (associations,  methods  and  rules)  of  the  superclass.  It  can  also  be  said  that  the 
subclass is a specialization of the superclass. An Aggregation association type models the
relationship that an object of the constituent class (entity or domain) is the value of a data
attribute ("data member") of the defining class. Notice that these two association types
are common in most OO data models.

Methods. Methods are used to implement some operational semantics of objects. A
method  consists  of  a  signature  (specification)  which  includes  the  method  name,  the 
names  and  types  of  its  parameters  and  a  return  type,  and  a  body  which  is  the 
implementation  of  the  method.  A  method,  when  executed,  may  change  the  state  of  an 
object. A method is executed by sending a message to the desired object. The message
contains  the  method  signature  and  the  parameter  values. 
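
The class and association constructs of the kernel model described above can be sketched as simple Python data structures. The ClassDef structure and the all_attributes function are hypothetical, and the small schema is only loosely modeled on the PC Board example: the attribute names and the assumption that Resistor and Capacitor are subclasses of Part are illustrative. Generalization links are stored in the defining class (the superclass) and point to the constituent subclasses, following the direction stated in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ClassDef:
    name: str
    class_type: str                                   # "entity" or "domain" here
    # Aggregation links: attribute name -> constituent class name.
    aggregations: Dict[str, str] = field(default_factory=dict)
    # Generalization links: names of constituent classes (the subclasses).
    subclasses: List[str] = field(default_factory=list)


def all_attributes(schema: Dict[str, ClassDef], name: str) -> Dict[str, str]:
    """Collect own and inherited Aggregation attributes of class `name`."""
    attrs: Dict[str, str] = {}
    for cls in schema.values():
        if name in cls.subclasses:                    # cls is a superclass of name
            attrs.update(all_attributes(schema, cls.name))
    attrs.update(schema[name].aggregations)
    return attrs


schema = {
    c.name: c
    for c in [
        ClassDef("Real", "domain"),
        ClassDef("String", "domain"),
        ClassDef("Part", "entity", {"vendorCode": "String"}, ["Resistor", "Capacitor"]),
        ClassDef("Resistor", "entity", {"resistance": "Real"}),
        ClassDef("Capacitor", "entity", {"capacitance": "Real"}),
    ]
}
print(all_attributes(schema, "Resistor"))
# {'vendorCode': 'String', 'resistance': 'Real'}
```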

Rules.  Rules  are  used  to  define  the  operational  semantics  of  objects  at  the 
specification  level  instead  of  the  implementation  level.  A  rule  is  an  abstraction  which 
represents  a  set  of  actions  to  be  performed  when  certain  events  occur  and  certain  data 
conditions  are  met  (i.e.,  the  rule  is  triggered).  Rules  used  in  K.3  are 
Event-Condition-Action-AlternativeAction (ECAA) rules. An event can be a
system-defined operation such as create, insert, update or delete, or a user-defined
operation specified by a method. The coupling mode of a rule indicates when the rule
is  to  be  triggered  with  respect  to  an  event,  i.e.,  before  the  event,  after  the  event,  etc.  The 
event  specification  of  a  rule  indicates  which  events  cause  the  rule  body  to  be  triggered. 
The  rule  priority  specifies  the  priority  of  triggering  the  rule  with  respect  to  others.  A 
priority number is an integer used to assign priority to a rule, with a higher number
indicating  a  higher  priority.  A  condition  in  a  rule  is  a  predicate  expression  which 
specifies  the  condition(s)  to  be  tested  upon  the  triggering  of  the  rule  and  used  to 
determine  whether  to  perform  the  rule's  action  (if  the  condition  is  TRUE)  or  the  rule's 
alternative  action  (if  the  condition  is  FALSE).  Rules  are  processed  at  the  instance  level, 
i.e.,  they  are  processed  against  the  instance  for  which  an  event  occurred.  Two  types  of 
rules  are  supported  in  OSAM*/X:  explicit  rules,  which  are  defined  explicitly  in  classes, 
and parameterized rules, which are used for defining the semantics of new modeling
constructs and are bound to classes that have the defined semantic properties. The second
type  of  rules  will  be  discussed  later  in  this  dissertation. 
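
The rule components described above can be sketched in Python. This is only an illustration under stated assumptions, not the K.3 rule processor: the names ECAARule and fire are hypothetical, coupling modes are plain strings, and rules are evaluated against a single instance in descending priority order.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class ECAARule:
    """Hypothetical container for the parts of an ECAA rule."""
    name: str
    events: List[str]                       # event specification, e.g. ["create"]
    coupling: str                           # coupling mode, e.g. "before", "after"
    condition: Callable[[Any], bool]        # predicate tested when the rule is triggered
    action: Optional[Callable[[Any], None]] = None        # taken if condition is TRUE
    alternative: Optional[Callable[[Any], None]] = None   # taken if condition is FALSE
    priority: int = 0                       # higher number means higher priority


def fire(rules, event, coupling, instance):
    """Process the rules triggered by (event, coupling) against one instance."""
    triggered = [r for r in rules if event in r.events and r.coupling == coupling]
    fired = []
    for rule in sorted(triggered, key=lambda r: r.priority, reverse=True):
        branch = rule.action if rule.condition(instance) else rule.alternative
        if branch is not None:
            branch(instance)
        fired.append(rule.name)
    return fired
```

Because rules are processed at the instance level, fire receives only the instance for which the event occurred rather than the whole class extent.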

Figure  3-1  presents  an  example  of  a  semantic  diagram  describing  a  PC  Board 
manufacturing database. In this example, the database is defined using only the modeling
constructs  of  the  kernel  model  (i.e.,  without  extensions).  A  semantic  diagram  is  a 
graphical  representation  of  the  semantic  description  (or  schema)  of  a  database.  In  the 
semantic  diagram,  classes  are  represented  by  geometric  figures,  and  associations  are 
represented  by  straight  lines  (called  links).  Other  symbols  are  used  to  describe  additional 


Figure 3-1. Semantic diagram of PC Board Manufacturing Schema defined using only
kernel model.

semantics,  as  shall  be  discussed  later.  A  rectangle  is  used  to  represent  an  entity  class,  like 
the  class  Part  in  the  figure.  Domain  classes  (e.g.,  the  class  String  in  the  figure)  are 
represented  by  circles.  Notice  that  each  figure  is  labeled  with  the  name  of  the  class  it 
represents.  Other  class  types  defined  by  model  extensions  are  represented  by  other 
geometric  shapes.  Links  that  represent  associations  are  labeled  with  the  name  of  the 
association  and  a  mnemonic  indicating  its  association  type  (for  example,  G  stands  for 
Generalization,  and  A  for  Aggregation).  In  the  figure,  the  class  Part  is  defined  as  a 
superclass  of  the  classes  Resistor,  Capacitor,  IntegratedCircuit  and  Board,  by  means  of 
G  associations.  Aggregation  associations  are  used  to  define  the  attributes  of  classes.  For 
example,  the  attributes  "partNo"  and  "partDescr"  are  defined  by  means  of  A  associations 
between  the  entity  class  Part  and  the  domain  classes  Integer  and  String,  respectively. 

The  class  VendorPart  is  used  to  model  a  relationship  as  an  entity  object.  A  VendorPart 
object  instance  represents  a  relationship  between  a  Vendor  and  a  Part  object,  recording 
the  fact  that  a  vendor  supplies  a  part,  with  the  cost  given  by  the  attribute  "partCost". 

Operational  (behavioral)  semantics  are  defined  by  means  of  rules  and  methods. 

Showing  either  rules  or  methods  in  the  semantic  diagram  can  make  it  look  messy. 

Therefore  a  separate  syntactic  representation  is  preferred.  The  following  rule  specifies  that 

a  Part  cannot  have  a  null  part  number: 

rule Part::noNullPartNo
    triggered on_commit create(), update(partNo)
    condition partNo.isNull()
    action
        abort_tx("NULL PART NUMBER");
end;

The above rule is defined using the K.3 language. This rule, defined in the class Part,
is triggered upon the commit of the transaction in which either the operation create() or
update() is performed. These two operations are system-defined methods implementing
the semantics of object creation and object update, respectively. In this rule, the
system-defined method isNull() is invoked on the partNo object. If isNull() returns TRUE,
then the "action" clause of the rule is taken, thus aborting the transaction.

The  following  is  an  example  of  a  method  (implemented  in  K.3)  which  computes  the 

cost  of  a  product  based  on  the  present  costs  of  its  parts: 

method Product::computeCost() : Real
local total : Real;
begin
    total := 0;
    context this * [parts] p:Part do
        total := total + p.lastCost;
    end;
    return total;
end;
end;

This method uses a "context" expression to obtain all the Part objects (represented
by  the  association  link  "parts")  associated  with  "this"  Product  object  (i.e.,  the  receiver  of 
the  message)  and  uses  the  "do"  iteration  statement  to  add  the  lastCost  value  of  those 
objects  to  the  "total"  variable,  which  will  have  as  its  value  the  total  cost  of  the  product 
at  the  end  of  the  method  execution. 
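
For readers unfamiliar with K.3, the same computation can be approximated in Python. This is a rough analogue under assumptions, not a translation of K.3 semantics: the "parts" association link is modeled as a plain list attribute, and the classes Part and Product below are hypothetical stand-ins for the schema classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Part:
    part_no: int
    last_cost: float


@dataclass
class Product:
    # The "parts" association link, modeled here as a simple list.
    parts: List[Part] = field(default_factory=list)

    def compute_cost(self) -> float:
        """Sum the lastCost of every Part associated with this Product."""
        total = 0.0
        for p in self.parts:        # plays the role of: context this * [parts] p:Part do
            total += p.last_cost
        return total
```

In K.3 the "context" expression is evaluated declaratively by the KBMS; the explicit loop here only mirrors the iteration performed by the "do" statement.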

3.2  Model  Extensibility 

In  the  example  we  just  presented,  it  is  evident  that  some  semantics  cannot  be 
expressed  in  the  semantic  diagram  with  the  given  limited  modeling  constructs.  By  looking 
at the semantic diagram alone, the following questions remain unanswered: (i) are there any
key attributes in these classes? (ii) is it possible for a part to be both a resistor and an
integrated circuit? (iii) could the same part be supplied by many vendors? (iv) are there
any referential constraints applicable to the VendorPart relationship? Notice that if the
data model were fixed, then these semantics would need to be expressed by one of the
following means: (i) give textual descriptions or comments in the semantic diagram, (ii)
implement the semantics in methods, or (iii) use rules to declaratively specify the semantics.
Notice that (i) can be imprecise or ambiguous, whereas (ii) can be difficult to understand
or modify. The approach of using rules is preferred, as rules are part of the class
specification and have a higher level of abstraction. However, if the semantics
expressible by rules are associated with many data elements in a schema (e.g., rules for
expressing  different  types  of  constraints  on  classes  and  associations),  then  these  rules  will 
have  to  be  explicitly  and  repeatedly  specified  for  all  these  data  elements.  Clearly, 
extending  the  model  to  provide  explicit  keywords  which  implicitly  specify  their 
corresponding  rules  is  preferable. 

OSAM*/X  is  extensible  in  the  sense  that  new  modeling  constructs  can  be  added  to  the 
model  to  enhance  its  expressiveness.  New  constructs  are  added  by  explicitly  defining 
their  semantics  using  parameterized  ECAA  rules,  which  become  the  implicit  semantics 
of  the  new  modeling  constructs.  For  instance,  the  generalization  (or  G)  association  type 
has  the  following  implicit  semantics:  (i)  inheritance,  i.e.,  a  subclass  inherits  all  the 
properties  of  a  superclass,  and  (ii)  existence  dependency,  i.e.,  an  object  in  a  subclass 
cannot  exist  in  the  database  if  the  associated  object  in  the  superclass  does  not  exist. 
Notice  that  by  specifying  a  G  association  between  two  classes,  its  semantics  are  implied. 
Similarly,  one  may  define  a  new  type  of  association,  X,  which  does  not  imply 
inheritance  but  has  the  existence-dependency  property.  By  using  an  association  of  type 
X,  these  semantics  are  implied  by  the  association  type. 

OSAM*/X is extensible in many aspects, including the ability to define (i) new
association  types,  as  described  above;  (ii)  new  types  of  classes  which  share  common  class 
characteristics;  and  (iii)  new  constraint  types  associated  with  classes  and  associations  (e.g., 
cardinality  constraint  of  an  aggregation  association). 

For example, Figure 3-2 shows a new schema that contains several extended
modeling constructs. A new association type, Interaction (I), was defined to capture the
semantics of relationships, including (i) multiple constituent classes: more than one
constituent class is allowed under the same association (i.e., multiple association links);
(ii) referential constraint: an object in the defining class must refer to (be associated with)
an object in each of the constituent classes; and (iii) cardinality constraint: either 1-many,
many-1, or many-many cardinality constraints may be specified among instances of the
constituent classes. Notice that the class VendorPart has an I association defined with the
classes Vendor and Part, with a cardinality constraint of many-many (m:n).

A  set  exclusion  (SX)  constraint  was  also  added  to  the  model  to  specify  that  an  object 
in  a  class  cannot  have  a  related  object  in  more  than  one  of  its  constituent  classes  in  a  set 
of  G  associations.  For  example,  there  is  an  SX  constraint  defined  in  the  G  associations 
of  the  class  Part,  recording  the  fact  that  a  Part  can  be  either  a  Resistor,  a  Capacitor,  an 
IntegratedCircuit  or  a  Board,  but  cannot  be,  for  example,  both  a  Resistor  and  a  Capacitor. 
The  model  was  also  extended  with  a  UNIQUE  (key)  constraint  to  specify  that  an  attribute 
uniquely  identifies  an  object  within  a  class.  In  the  semantic  diagram,  a  UNIQUE 
constraint  is  represented  with  double  lines  (=)  crossing  the  link  that  represents  the 
attribute. The semantics implied in this constraint are that if the attribute of an object has
value  X,  then  no  other  object  can  have  the  same  attribute  value  X. 
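
The SX and UNIQUE semantics just described can be illustrated with two small Python checks. This is a sketch under assumptions: objects are modeled as dictionaries, subclass membership as a mapping from object identifier to a set of subclass names, and the function names are hypothetical. In OSAM*/X these semantics are enforced by rules, not by batch checks like these.

```python
from collections import Counter


def unique_violations(objects, attribute):
    """Return the attribute values shared by more than one object in a class."""
    counts = Counter(obj[attribute] for obj in objects)
    return sorted(value for value, n in counts.items() if n > 1)


def sx_violations(memberships):
    """Return the ids of objects that appear in more than one subclass.

    memberships maps an object id to the set of subclasses (linked to the
    superclass by G associations) in which that object has an instance.
    """
    return sorted(oid for oid, subclasses in memberships.items()
                  if len(subclasses) > 1)
```

An empty result from either function means the corresponding constraint holds for the data examined.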

Another model extension was made to define indices associated with classes. Although
indices  can  be  considered  more  an  implementation  issue  than  a  specification  issue, 
including  indices  in  the  specification  provides  more  explicit  information  on  how  a 
database  is  designed,  what  processing  capability  is  required  in  the  DBMS,  and  how  query 
processing  can  be  optimized.  In  the  semantic  diagram,  indexing  over  an  attribute  is 
represented  by  an  oval  drawn  over  the  association  link  that  defines  the  attribute. 

The above model extensions need to be added to the kernel model and their semantics
need  to  be  defined.  The  approach  we  take  to  achieve  this  model  extensibility  is  to  extend 
a  metamodel  and  use  parameterized  rules  to  define  the  semantics  of  the  new  constructs. 
We  shall  explain  this  approach  in  the  following  sections. 

3.3 The Metamodel

A  metamodel  is  a  model  which  defines  the  syntax  and  semantics  of  a  semantic 
model.  The  metamodel  of  K.3  is  defined  by  using  the  modeling  constructs  of  the  kernel 
model  of  OSAM*/X.  In  this  approach,  a  number  of  metaclasses  are  defined  to  specify  the 
syntax  and  semantics  of  all  the  modeling  constructs  of  OSAM*/X,  including  classes, 
associations,  methods  and  rules  (both  explicit  and  parameterized  rules).  This  approach 
has  many  advantages.  Firstly,  by  defining  the  metamodel  using  the  kernel  model  itself, 
we  can  use  K.3  to  define  model  extensions.  Secondly,  the  definition  of  the  metamodel 
using  K.3  facilitates  the  use  of  the  functionalities  of  the  supporting  KBMS  to  store  and 
retrieve  knowledge  about  the  kernel  model  and  model  extensions.  By  the  same  token, 
since  metaclasses  contain  the  knowledge  about  application  schemas,  other  KBMS  clients, 
such  as  a  graphical  user  interface  or  an  application  system,  can  use  the  KBMS  to  retrieve 
knowledge  about  a  given  application  schema. 

Figure  3-3  shows  the  semantic  diagram  of  the  metamodel  of  the  kernel  model  of 
OSAM*/X.  The  class  Class  represents  the  semantics  of  all  types  of  classes  in  OSAM*/X. 

Figure 3-3. Semantic diagram of kernel metaschema

The subclasses of Class represent the different types of classes. For example, Entity,
Domain, Schema and Aggregate represent entity, domain, schema and aggregate classes,
respectively. Another type of class, MetaClass, is defined as a subclass of Entity and its
purpose is to store the definitions of all the metaclasses defined in the model. For
example, the objects that represent the classes Entity and Domain will each have an
instance in the classes Class, Entity and MetaClass. Metaclasses are objects too, i.e., they
have a representation in the KB in the form of objects. Therefore, all metaclasses (e.g.,
Entity, Domain, etc.) have instances in the class MetaClass. Since MetaClass is a subclass
of classes Entity and Class, objects that have an instance in MetaClass also have an
instance in Entity and Class. On the other hand, an application class like Part will have
an instance in the classes Class and Entity, but no instance in the class MetaClass. Notice
that an attribute called "baseClass" is defined in some metaclasses. Its function is to
specify  a  class  that  will  act  as  the  base  class  for  all  the  application  classes  whose  class 
type  is  defined  by  a  metaclass.  For  example,  the  class  Entity  has  a  base  class  named 
EClassObject.  Therefore,  any  application  entity  class  like  Part  will  be  defined  as  a 
subclass  of  EClassObject.  Similarly,  DClassObject  is  the  base  class  for  all  Domain 
classes.  Therefore,  the  "baseClass"  attribute  of  the  class  Domain  is  defined  over 
DClassObject.  Notice  that  EClassObject  has  an  attribute  called  "oid"  which  is  the  object 
identifier  common  to  all  objects  of  entity  classes. 
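
The instance-membership reasoning above can be checked with a small Python sketch. It assumes only the generalization links named in the text (MetaClass under Entity, Entity and Domain under Class); the dictionary SUPERCLASSES and the function extents are hypothetical names used for illustration.

```python
# Generalization links among a few metaclasses, as described in the text.
SUPERCLASSES = {
    "Class": [],
    "Entity": ["Class"],
    "Domain": ["Class"],
    "MetaClass": ["Entity"],
}


def extents(direct_class):
    """Classes in which an object has an instance, given its most specific class.

    An instance of a subclass also has an instance in every superclass, so the
    result is the class itself plus all classes reachable through G links.
    """
    seen = set()
    stack = [direct_class]
    while stack:
        cls = stack.pop()
        if cls not in seen:
            seen.add(cls)
            stack.extend(SUPERCLASSES[cls])
    return seen
```

The object describing a metaclass such as Entity is created in MetaClass, so extents("MetaClass") applies to it; the object describing an application class such as Part is created in Entity, so extents("Entity") applies and MetaClass is absent from the result.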

The  metaclass  Assoc  defines  the  semantics  of  all  types  of  associations.  The  classes 
Generalization  and  Aggregation  are  subclasses  of  Assoc,  each  representing  the  semantics 
of  the  G  and  A  association  types,  respectively.  The  attribute  "definingClass"  in  Assoc 
specifies the defining class of an association as its value, thus having the class Class as
its type. The class AssocLink represents an association link in an association. Each
association link belonging to the same association instance will have an instance in
AssocLink, and will be associated with the corresponding Assoc object by the link
"definingAssoc".

Other  metaclasses,  such  as  Rule,  ParamRule  and  Method,  represent  explicit  rules, 
parameterized  rules,  and  methods,  respectively.  Notice  that  each  of  these  metaclasses 
models  rules  or  methods  in  terms  of  their  relationships  with  other  metaclasses  and  their 
properties  which  are  used  to  describe  their  semantics.  For  example,  the  attribute 
"triggerConds"  in  the  class  Rule  represents  the  trigger  conditions  specified  in  a  rule. 

3.4  Realization  of  Model  Extensibility 

Our  approach  to  model  extensibility  uses  procedural  semantics.  The  semantics  of 
model  extensions  can  be  represented  in  terms  of  events,  conditions  and  actions.  Events 
include  both  system-defined  and  user-defined  operations.  One  can  specify  the  semantics 
of  a  modeling  construct  by  identifying  which  events  the  semantics  depend  on.  For 
example,  if  the  semantics  depend  on  the  value  of  an  attribute,  then  create  and  update 
are  related  events.  Given  an  event,  there  may  be  some  conditions  that  need  to  be  satisfied 
before  enforcing  the  desired  semantics.  Clearly,  Event-Condition-Action  (ECA)  rules  are 
a useful abstraction to represent these semantics.

When defining a model extension, one needs to identify the following: (i) which
events  in  the  database  may  affect  the  desired  semantics,  (ii)  what  are  the  conditions  that 
must  be  satisfied  in  order  to  capture  the  semantics,  and  (iii)  what  are  the  actions  that  need 
to  be  performed.  For  example,  consider  the  MAXOBJ(n)  constraint  which  specifies  that 
a class cannot have more than "n" instances. The only possible event that affects the
semantics is the "create" event, since the constraint can be violated only when adding
more  objects  to  a  class.  One  condition  that  is  checked  to  enforce  the  semantics  is  that 
the  number  of  objects  after  "create"  exceeds  the  specified  maximum.  Depending  on  the 
constraint  enforcement  policy,  the  action  can  be  either  to  delete  the  object  or  to  abort  the 
transaction. 
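
The event, condition and action identified for MAXOBJ(n) can be put together in a short Python sketch. It is written under assumptions: the class Extent and the exception TransactionAborted are hypothetical, and the "abort" policy is simplified, since a real system would roll back the whole transaction rather than only remove the offending object.

```python
class TransactionAborted(Exception):
    """Raised when the enforcement policy is to abort the transaction."""


class Extent:
    """Hypothetical class extent that enforces MAXOBJ(n) after each create."""

    def __init__(self, max_count, policy="abort"):
        self.objects = []
        self.max_count = max_count
        self.policy = policy            # "abort" or "delete"

    def create(self, obj):
        self.objects.append(obj)                    # event: create
        if len(self.objects) > self.max_count:      # condition, tested after the event
            self.objects.pop()                      # action: remove the offending object
            if self.policy == "abort":
                # Simplification: only the new object is undone here.
                raise TransactionAborted("MAXOBJ constraint violated")
            return False
        return True
```

Both policies leave the extent with at most max_count objects; they differ only in whether the caller is notified by a return value or by an aborted transaction.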

As  we  have  presented  earlier,  modeling  constructs  have  implicit  semantics.  Therefore, 
there  exists  a  set  of  implicit  rules  that  capture  the  semantics  of  a  modeling  construct. 
To  facilitate  model  extensions,  there  must  be  a  mechanism  which  allows  the  definition 
of  rules  in  some  form  and  is  generally  applicable  to  all  the  uses  of  an  extended  construct. 
We  use  parameterized  rules  for  this  purpose.  Parameterized  rules  are  defined  in  a 
metaclass  to  specify  the  intended  semantics. 

A parameterized rule is similar in form to a regular ECAA rule, with the following
differences: (i) binding parameters are used instead of class-specific properties, (ii) a set
of applicable classes is specified, and (iii) a set of binding conditions may be given. The
following is an example of a parameterized rule (defined in K.3) that enforces the
MAXOBJ(n) constraint:

param_rule MaxObjectConstraint::maxl
    bind_classes @definingClassName()
    bind_if @definingClassType()="Entity"
    triggered after create()
    condition (context @definingClassName()).count() > @maxCount
    action
        abort_tx("MAXOBJ constraint violated");
end;

This rule named "maxl" is defined in the metaclass MaxObjectConstraint. The
symbol @ marks the binding parameters, which are used to bind the rule to the
applicable classes specified in the "bind_classes" clause. Notice that parameters can be
bound to methods like definingClassName(), which returns the name of the defining class
in an association. The "bind_if" clause specifies a condition which needs to be satisfied
in order to translate the parameterized rule into an explicit rule for the defining class.
In this example, it is stated that the class in question (defining class) must be of type
"Entity". This is tested by invoking the system-defined method definingClassType(). If
a MAXOBJ(100) constraint is specified in class X, for example, the following rule will
be bound to class X:

rule X::maxl
    triggered after create()
    condition (context X).count() > 100
    action
        abort_tx("MAXOBJ constraint violated");
end;

Notice  that  the  count()  method  returns  the  number  of  objects  contained  in  the  set 
specified  by  the  expression  "context  X",  which  returns  all  the  objects  contained  in  class 
X. 
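
The translation from a parameterized rule to an explicit rule can be pictured as a text substitution over the @ parameters. The Python sketch below is an assumption-laden illustration, not the K.3 compiler's algorithm: the function bind_rule, the TEMPLATE string and the dictionary of bindings are hypothetical, and each parameter is resolved by a simple lookup rather than by invoking a system-defined method.

```python
import re

# Body of the parameterized rule, with @ marking the binding parameters.
TEMPLATE = """triggered after create()
condition (context @definingClassName()).count() > @maxCount
action
    abort_tx("MAXOBJ constraint violated");
end;"""


def bind_rule(rule_name, template, bindings, bind_if=None):
    """Return explicit rule text for one class, or None if bind_if is not met."""
    if bind_if is not None and not bind_if(bindings):
        return None

    def substitute(match):
        # match.group(1) is the parameter name without "@" and without "()".
        return str(bindings[match.group(1)])

    body = re.sub(r"@(\w+)(?:\(\))?", substitute, template)
    return f"rule {bindings['definingClassName']}::{rule_name}\n{body}"
```

Calling bind_rule with definingClassName bound to "X" and maxCount bound to 100 yields text with the same content as the explicit rule shown above, while a binding whose class type is not "Entity" produces no rule when the bind_if predicate tests for it.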

3.5  Aspects  of  Extensibility  Supported 

In  our  approach  to  model  extensibility,  it  is  predetermined  which  constructs  of  the 
model  are  going  to  be  extensible.  This  knowledge  drives  the  definition  of  the  kernel 
model  as  well  as  the  implementation  of  the  underlying  KBMS  and  the  compiler,  since 
those  constructs  that  are  going  to  be  extensible  may  require  additional  functionality  from 
the  underlying  KBMS,  as  well  as  more  functionality  from  the  compiler  itself.  Once  the 
kernel  extensible  model  is  defined,  and  the  KBMS  and  compiler  functionality  properly 
extended,  model  extensions  that  are  supported  would  not  require  changes  in  the  system 
implementation.  In  the  following  discussion,  when  we  refer  to  "extensible"  or  "can  be 
extended", we mean that the model can be extended without requiring a programming
effort in changing the system implementation. We support model extensibility in the
following  aspects: 

Class-type  extensibility.  The  kernel  model  of  OSAM*/X  has  predefined  four  types 
of  classes:  entity,  domain,  schema  and  aggregate  classes.  The  model  can  be  extended 
with  new  types  of  classes,  as  long  as  they  are  a  subtype  of  the  existing  class  types,  i.e., 
Entity,  Domain,  Schema  or  Aggregate. 

Association-type  extensibility.  New  association  types  can  be  defined  by  means  of  a 
metaclass  defined  as  a  subclass  of  the  metaclass  Assoc.  The  name  of  the  new  metaclass 
becomes  the  name  of  the  association  type,  and  the  compiler  recognizes  it  as  a  keyword. 

Class,  association  and  method  property  extensibility.  New  class,  association  and 
method  properties  can  be  defined  by  means  of  attributes  in  the  respective  metaclass.  The 
most common examples of such properties are constraints. Another example is the
definition  of  indices  on  class  attributes,  as  will  be  presented  later  in  this  dissertation. 
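
As an illustration of association-type extensibility, the set of association-type keywords can be derived from the metaclass hierarchy instead of being hard-coded. The Python sketch below assumes a hypothetical snapshot of that hierarchy (METACLASS_PARENT) and hypothetical helper names; it does not describe how the K.3 compiler actually stores or queries this knowledge.

```python
# Hypothetical snapshot of the metaclass hierarchy: metaclass name -> parent.
METACLASS_PARENT = {
    "Class": None,
    "Entity": "Class",
    "Domain": "Class",
    "Assoc": None,
    "Generalization": "Assoc",
    "Aggregation": "Assoc",
    "Interaction": "Assoc",     # added by a model extension
}


def is_subclass_of(name, ancestor, parents):
    """Walk the parent chain of name and report whether ancestor is on it."""
    current = parents.get(name)
    while current is not None:
        if current == ancestor:
            return True
        current = parents.get(current)
    return False


def association_keywords(parents):
    """Names a compiler could accept as association types."""
    return sorted(n for n in parents if is_subclass_of(n, "Assoc", parents))
```

Adding a new entry under Assoc changes the result of association_keywords without any change to the code, which is the sense in which no programming effort is needed for this kind of extension.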


CHAPTER  4 
THE  SYNTAX  AND  SEMANTICS  OF  K.3 

This  chapter  presents  the  syntax  and  semantics  of  K.3.  We  will  start  by  presenting 
an  overview  of  the  architecture  of  the  compiler  from  the  user's  viewpoint.  We  then  present 
the  specification  constructs  of  K.3,  followed  by  a  discussion  on  the  types  of  expressions 
supported  by  the  language,  as  well  as  how  method  implementation  is  defined.  We  then 
present  the  control  statements  of  the  language,  followed  by  a  brief  discussion  on 
preprocessing. 

In  our  representation  of  syntax,  we  will  use  the  following  notation:  (i)  keywords  are 
given  in  bold  typeface;  (ii)  optional  clauses  are  given  inside  square  brackets;  (iii) 
variable  clauses  (or  "non-terminal  symbols")  are  given  inside  the  symbols  <  and  >  ;  (iv) 
choices within clauses are delimited by bars ( | ); (v) if one of the above symbols is a
literal, it will be given in bold typeface; and (vi) repeating patterns are specified using
three dots (...).

4.1  Architecture 

Contrary  to  the  traditional  DBMS  architecture,  in  which  applications  written  in 
dissimilar  languages  interact  with  a  DBMS  by  using  embedded  queries,  K.3  exploits  the 
concept  of  transparency.  Objects  are  seen  by  the  application  developer  as  being  managed 
by  the  application  rather  than  by  the  DBMS.  The  application  developer  does  not  have  to 
specify  mappings  between  the  data  representation  of  the  DBMS  and  the  data 
representation of the language. Objects from the same class can be declared either as
persistent (i.e., they remain stored in the database after the termination of a program), or
transient (i.e., memory resident). Both types of objects have the same properties with
respect to structure, transaction control (i.e., the state of both persistent and transient
objects can be rolled back by the abortion of a transaction) and query processing (i.e.,
both transient and persistent objects are returned by queries). In addition, by using
OQL-based query expressions, declarative association-based retrieval can be specified
instead  of  the  common  navigational  approach  of  many  OODBPLs. 

Figure  4-1  presents  the  architecture  of  applications  written  in  K.3.  Each  K.3 
application  is  linked  with  a  KBMS  library,  which  provides  functionalities  such  as  object 
management,  transaction  management,  event  detection  and  query  processing.  At  run  time, 
methods are invoked on objects, and the method implementations use the services of the
KBMS  library.  In  order  to  provide  a  common  repository  of  data,  functions  in  the  KBMS 
library  use  the  services  of  a  Storage  Management  (SM)  server,  which  manages  the  data 
stored  in  the  knowledge  base  (KB).  The  SM  provides  the  functionalities  of  storage 
(get/put  semantics),  recovery  and  concurrency  control. 

The  K.3  compiler,  like  K.3  applications,  is  linked  with  the  KBMS  library.  This 
enables  the  compiler  to  use  the  facilities  of  the  KBMS  functions  for  the  storage  and 
retrieval of data-model knowledge (i.e., the kernel model and its extensions), and
application-schema  knowledge  (i.e.,  the  dictionary  containing  the  specification  of  all  the 
schemas  defined  in  the  KB).  Before  compilation,  the  compiler  retrieves  from  the  KB  the 
description  of  the  current  semantic  model,  which  is  used  to  bind  the  code  that 
implements  the  semantics  defined  in  a  schema.  After  compilation,  the  compiler  updates 
the  KB  with  information  about  the  defined  schema. 

Figure 4-1. Architecture of applications written in K.3

K.3  is  intended  to  be  a  universal  specification  and  implementation  language  to  be  used 
by  different  types  of  users,  namely  knowledge  base  customizers  (KBCs),  knowledge  base 
administrators  (KBAs)  and  application  developers  (ADs).  This  concept  is  illustrated  in 
Figure  4-2.  A  KBC  uses  K.3  to  define  model  extensions.  The  resulting  model,  which  is 
stored  in  the  knowledge  base,  is  used  by  the  KBA  to  define  the  conceptual  schema, 
which  is  also  stored  in  the  knowledge  base.  An  AD  writes  his/her  application  in  K.3  by 
using  both  the  model  and  the  conceptual  schema,  and  adding  the  corresponding  "client 
classes"  or  classes  which  use  the  classes  in  the  conceptual  schema  and  implement  the 
semantics  of  the  application.  The  use  of  a  universal  language  has  two  advantages.  First, 
it promotes standardization, which eases the communication and specification of
requirements. Second, it promotes reusability, as the specifications of software
components (classes) are done using a common, semantically rich model, thus allowing
a better understanding of their specifications.

Figure 4-2. Usage of K.3 by KBA, KBC and AD


4.2  Specification  constructs 
4.2.1  Class  Definition  Statement 

Using  the  K.3  syntax,  users  define  schemas  by  means  of  classes  (or  model  extensions 
by  means  of  metaclasses),  which  are  in  turn  defined  in  terms  of  the  OSAM*/X  constructs 
such  as  associations,  methods  and  rules.  The  class  definition  statement  is  the  basic 
statement of K.3. Figure 4-3 shows the general structure of the class definition statement
"define". Following the keyword "define", the user provides the name of the class being
defined, followed by the type of the class and the name of the schema in which it is
defined. The "associations" section is used to define the class associations. The
"methods" section is used to define the class methods, and the "rules" section is used to
define the explicit rules (and parameterized rules) of the class. The "where" section is
used to define other class properties such as constraints. Notice that association
definitions are preceded by an identifier which corresponds to the association type being
used. This is one of the key features of K.3 that supports extensibility, as will be
discussed later. A "where" clause can be defined for each association, which is used to
describe additional association properties (e.g., constraints).

define <class_name> : <class_type> in <schema_name>
[
where:
    <expression_list>;
]
[
associations:
    [ public | private | protected : ]
    <association_type>
    {
        <link_name> : <class_name>
            [ where { <expression_list> } ]
        <link_name> ...
    }
    [ where { <expression_list> } ]
    [ public | private | protected : ]
    [ <association_type> ... ]
]
[
methods:
    [ public | private | protected : ]
    method <method_name> (<parameter_spec>) [ : <class_name> ]
        [ where { <expression_list> } ]
    [ method ... ]
]
[
rules:
    rule <rule_name>
        triggered <event-spec>
        [ condition <expression> ]
        [ action <statements> ]
        [ otherwise <statement> ]
    [ rule ... ]
]
end [ <class_name> ] ;

Figure 4-3. The general structure of the K.3 class definition statement

Figure  4-4  gives  an  example  of  a  K.3  specification  done  by  a  KBC.  Our  intention 
in  this  example  is  to  give  the  reader  a  general  idea  of  using  the  language  to  define  model 
extensions  in  terms  of  metaclasses  and  parameterized  rules.  With  this  specification,  the 
KBC  has  extended  the  model  with  a  new  association  type  called  Interaction.  An 
Interaction  association  is  used  to  model  a  relationship  among  two  or  more  classes.  Notice 
that Interaction is an "n-ary" association type, i.e., a single association that involves
multiple  constituent  classes  by  means  of  multiple  association  links.  The  Interaction 
association  type  is  useful  to  model  relationships  which  need  to  be  described  with  some 
attributes.  For  example,  each  instance  of  VendorPart  records  a  relationship  between  one 
instance  of  Vendor  and  one  instance  of  Part.  The  relationship  has  the  attribute  "partCost" 
whose  value  is  the  cost  of  a  part  supplied  by  a  vendor.  The  referential  constraint  is 
implicit in this association type, i.e., the value in each association link should refer to
some instance in each of the constituent classes.

define Interaction : MetaClass
associations:
    // define Interaction as an association type
    // (subclass of Assoc)
    SPECIALIZATION { Assoc };
rules:
    // referential constraint inherent in Interaction
    param_rule referential
        bind_classes @definingClassName()
        triggered on_commit create(), update(@linkNames())
        condition
            EXIST(context this and (@*constituentClassNames()))
        otherwise
            abort_tx("REFERENTIAL constraint violated in class "
                     + "@definingClassName()");
    end;
end;

Figure 4-4. Specification in K.3 done by KBC

Notice that this constraint type is
defined by the parameterized rule "referential", which is bound to the defining class
(bind_classes  @definingClassName())  by  making  textual  substitutions  on  every 
expression  preceded  by  the  "@"  operator.  The  EXIST  macro  is  used  in  the  "condition" 
clause  of  the  rule  to  check  if  "this"  instance  (i.e.,  the  instance  to  which  the  rule  is 
applied)  has  an  association  with  at  least  one  instance  in  every  one  of  the  constituent 
classes.  If  the  condition  is  not  TRUE  (which  means  that  the  constraint  is  violated),  the 
"otherwise"  clause  of  the  rule  is  followed,  and  the  transaction  is  thus  aborted. 

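As a rough illustration of what the bound "referential" rule checks, the following
sketch uses Python rather than K.3 and is not part of the system described here. It
assumes that an Interaction instance can be modeled as a mapping from link names to
the referenced object, and that the extent of each constituent class is a set of oids.
The names referential_ok, commit, links and extents are hypothetical.

```python
def referential_ok(links, extents):
    """Return True if every link refers to an existing instance of its class.

    links:   maps link name -> (constituent class name, referenced oid or None)
    extents: maps class name -> set of oids currently stored for that class
    """
    for link_name, (class_name, oid) in links.items():
        if oid is None or oid not in extents.get(class_name, set()):
            return False
    return True


def commit(links, extents):
    """Mimic the rule: abort the transaction when the condition is not TRUE."""
    if not referential_ok(links, extents):
        # Corresponds to the "otherwise" clause of the rule in Figure 4-4.
        raise RuntimeError("REFERENTIAL constraint violated in class VendorPart")
    return "committed"
```

In this sketch the check runs only when commit is called, which mirrors the fact
that the rule is triggered at commit time rather than immediately after each update.
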
In Figure 4-5 we give an example of a specification done by a KBA in K.3. The
"define" statement is used this time to define the application class VendorPart in the
schema ManufSchema. The definition of this class makes use of the Interaction
association defined previously. Notice that this is achieved by using the name
"Interaction" in the "associations" clause of the class definition. K.3 recognizes
"Interaction" as the name of the metaclass Interaction and thus binds the
corresponding parameterized rules. Since this is an n-ary association, link definitions are
enclosed between curly braces to group the links under a single association. Also notice
the use of a "where" clause in the definition of each link. K.3 provides a "where" clause
in each of the specification constructs to allow the definition of additional properties that
make use of extended constructs. In Figure 4-5, the "where" clause in each association
link specification is used to define a CARDINALITY constraint for each link. Later in
this dissertation, we will discuss how this constraint type was also defined as a model
extension. A CARDINALITY(min,max) constraint states the minimum and
maximum number of instances in the constituent class that can be associated with an
instance in the defining class. This constraint can be used to specify the traditional
cardinality constraint on relationships, as shown in the example. A CARDINALITY(1,1)

define VendorPart : Entity in ManufSchema
associations:
public:
    Interaction
    {
        {
            vendor : Vendor where { CARDINALITY(0,10) }
            part   : Part   where { CARDINALITY(1,1) }
        }
    }
    Aggregation
    {
        partCost : Real;
    }
methods:
public:
    method print();    // print info to stdout
rules:
    // additional semantics not supported yet by model
    // extensions
    rule maxPartCost is
        triggered after update(partCost)
        condition
            partCost > 1000.00
        action
            abort_tx("Maximum part cost is $1000 \n");
    end;
end;


Figure 4-5. Specification in K.3 done by KBA

constraint is defined in the link "part" (whose constituent class is Part) to state that a Part
instance must be associated with one and only one VendorPart instance, i.e. a Part
must have exactly one Vendor. Similarly, a CARDINALITY(0,10) constraint is defined in
the link "vendor" to state that a Vendor instance may be associated with no VendorPart
instance at all (i.e., a Vendor may not supply any part) or with up to
ten VendorPart instances (i.e., a Vendor may supply up to 10 parts).
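
The reading of CARDINALITY given in this example can be sketched in Python (not K.3)
as a count of referencing instances per constituent instance. This is only an
illustration under the assumption that each VendorPart instance is a mapping from
link names to oids; the function name cardinality_violations and its parameters are
hypothetical and do not describe how K.3 implements the constraint.

```python
from collections import Counter


def cardinality_violations(instances, link, extent, min_count, max_count):
    """List the oids in `extent` whose reference count lies outside [min, max].

    instances: list of dicts, one per VendorPart-like instance (link name -> oid)
    link:      the link name to check, e.g. "part"
    extent:    all oids of the constituent class of that link
    """
    # Count how many instances refer to each constituent oid through `link`.
    counts = Counter(instance[link] for instance in instances)
    # Counter returns 0 for oids that are never referenced.
    return sorted(oid for oid in extent
                  if not (min_count <= counts[oid] <= max_count))
```

With three VendorPart instances in which part "p2" is supplied by two vendors and
part "p3" by none, checking the link "part" with bounds (1,1) would report both
"p2" and "p3", while checking the link "vendor" with bounds (0,10) would report
nothing.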

Also in Figure 4-5, an Aggregation association called "partCost" is used to define the
"partCost" attribute of the relationship. The method "print" is defined for the class, and
a rule called "maxPartCost" defines a constraint in terms of the maximum value of the
"partCost" attribute. Notice that this explicit rule was defined because this type of constraint
is not a modeling construct in the model used by the KBA. However, if it were a very
frequently used constraint, the model could have been extended to include a MAXVALUE
constraint as a modeling construct to capture such semantics. The complete specification
of the manufacturing schema is given in Appendix B.

4.2.2 Association Definition Statement

The  association  definition  statement  is  used  to  define  an  association  in  the  defining 
class.  Each  association  definition  statement  is  defined  in  the  "associations"  section  of  the 
class  definition.  The  syntax  of  the  association  definition  statement  is  as  follows: 

[ public | private | protected : ]
    <assoc_type_name> [ <- ]
    {
        [ { ]
            <link_name> : <type_spec> [ where { <expression_list> } ] ;
            <link_name> ...
        [ } ]
    } [ where { <expression_list> } ] ;

The keywords "public", "private" and "protected" are information-hiding qualifiers,
used to specify whether a property is public, i.e. visible to objects of other classes,
private, i.e. visible only to objects of the same class, or protected, i.e. visible to
objects of the same class and of its subclasses.

The <assoc_type_name> is the name of the association type. This name should match
the name of a metaclass that represents the association type, i.e. one defined as a
subclass of the metaclass Assoc. The second pair of curly braces is optional and is used to
group association links into a single n-ary association. The optional arrow (<-) is used
to define inverse association links, i.e. the class being defined acts as the constituent class
rather than the defining class. This is useful for defining class derivations with
Generalization links without changing (syntactically) the defining class.

The <type_spec> is the specification of the constituent class. It can be defined with
the name of the constituent class or with an aggregate class, as follows:

<aggregate_class_name> of <class_name>
where <aggregate_class_name> is the name of the aggregate class, and <class_name> is
the name of the class whose objects are to be contained in the aggregate. For example,
"Set of Course" specifies the constituent as a set of objects of type "Course".

The  "where"  clause  inside  the  link  specification  clause  is  used  to  specify  a  set  of 
comma-delimited  expressions.  Such  expressions  will  update  the  attributes  of  the  instance 
in  the  metaclass  AssocLink  that  represents  the  link  in  question.  Similarly,  the  "where" 
clause  at  the  end  of  the  statement  is  used  to  specify  expressions  which  will  update  the 
corresponding  object  of  the  metaclass  Assoc.  This  concept  will  be  discussed  in  more 
detail  in  the  next  chapter. 

4.2.3  Method  Specification  Statement 

A  method  consists  of  two  parts,  (i)  the  specification  or  "signature"  of  the  method, 
which  defines  the  interface  part  (or  how  it  is  invoked)  and  (ii)  the  implementation. 
Method  signatures  are  specified  using  the  method  specification  statement.  Method 
implementation  is  discussed  together  with  control  statements  in  later  sections. 

Methods  are  specified  in  the  "methods"  section  of  the  class  specification  using  the 
following  syntax: 

[ public | private | protected ]
    method <method_name> ( <parameter_list> ) [ : <return_type> ]
    [ where { <expression_list> } ] ;

The keywords "public", "private" and "protected" are information-hiding qualifiers, as
described previously. The <method_name> is an identifier which uniquely identifies the
method within the class. The <parameter_list> is a comma-delimited list of parameter
specifications, each of which is defined as follows:

<parameter_name> : <type_spec>
where <parameter_name> is the name of the parameter, and <type_spec> is the
specification of the parameter type, as described previously for associations. The
<return_type> of the method is optional, and is assumed to be "void" if omitted.
Otherwise, it is specified in the same way as <type_spec> to define the type of the object
returned by the method.

The  "where"  clause  inside  the  method  specification  is  used  to  specify  a  set  of 
comma-delimited  expressions  which  will  update  the  attributes  of  the  object  of  the 
metaclass Method, which represents the method in question. This allows additional
method properties to be defined.

4.2.4  Rule  specification  statement 

Both explicit rules and parameterized rules are specified in the "rules" section of the
class specification. Explicit rules are specified using the following syntax:

rule  <rule_name>  [  is  ] 
[  priority  <number>  ] 
triggered  <trigger_conditions> 
[  condition  <rule_condition>  ] 
[  action  <statements>  ] 
[  otherwise  <statements>  ] 

The <rule_name> is an identifier which uniquely identifies the rule within the class.
The priority number, <number>, is an integer which defines relative rule priority: rules
with higher priority numbers have higher priority and are triggered earlier than rules
with lower priority numbers. Priority numbers can be negative. If the priority number is
not specified, zero is assumed.
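
The ordering implied by priority numbers can be sketched as follows. The sketch is in
Python, not K.3, and only illustrates the stated convention (higher numbers first,
unspecified priority treated as zero). The function name trigger_order is hypothetical,
and the handling of ties is an assumption, since the text does not say how rules with
equal priority are ordered.

```python
def trigger_order(rules):
    """Order rules so that higher priority numbers come first.

    rules: list of (name, priority) pairs; a priority of None means
           "not specified" and is treated as zero.
    """
    with_defaults = [(name, 0 if priority is None else priority)
                     for name, priority in rules]
    # sorted() is stable, so rules with equal priority keep their listed order.
    # That tie-breaking choice is an assumption of this sketch.
    ordered = sorted(with_defaults, key=lambda rule: -rule[1])
    return [name for name, _ in ordered]
```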

The <trigger_conditions> is a list of trigger conditions, each of which is specified as
follows:

[ before | after | on commit ] <event_list>
where <event_list> is a comma-delimited list of events. Each event is either a method
defined in the class, the "update" operation or a system-defined method (defined in the
"baseClass" of the class type) such as "create", "del" or "destroy".

The "condition" clause is optional. Omitting the rule condition has the same effect as
specifying a condition which is always TRUE. The <rule_condition> in the condition
clause is a predicate or a guarded expression. A guarded expression is specified as
follows:

( C1, C2, ..., Cn-1 | Cn )
where each Ci is a boolean expression. A guarded expression is evaluated as follows. The
first n-1 expressions are evaluated in order from left to right. If any of the expressions
evaluates to FALSE, evaluation stops at that point and the whole rule is skipped. If all
the expressions are TRUE, then the expression Cn is evaluated, returning TRUE if Cn is
TRUE (thus following the "action" clause), or FALSE if Cn is FALSE (thus following
the "otherwise" clause). Guarded expressions are useful when multiple conditions need
to be satisfied before another condition can be tested.
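
The three possible outcomes of a guarded expression can be illustrated with a short
Python sketch (not K.3). Each Ci is assumed to be modeled as a zero-argument callable
so that evaluation can stop early; the function name evaluate_guarded and the result
labels "skip", "action" and "otherwise" are hypothetical names chosen for this sketch.

```python
def evaluate_guarded(guards, final):
    """Evaluate the guarded expression (C1, ..., Cn-1 | Cn).

    guards: list of zero-argument callables for C1 .. Cn-1
    final:  zero-argument callable for Cn
    Returns "skip", "action" or "otherwise".
    """
    for guard in guards:          # left to right
        if not guard():
            return "skip"         # a guard is FALSE: the whole rule is skipped
    # All guards are TRUE, so Cn decides between the two clauses.
    return "action" if final() else "otherwise"
```

Note the difference from a plain conjunction: a FALSE guard does not lead to the
"otherwise" clause; it causes the rule to be skipped altogether.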

The "action" clause is optional except in either of the following two cases: (i) no rule
condition is specified, and (ii) a rule condition is specified but no "otherwise" clause. A
set of semicolon-delimited statements are given in this clause to specify the set of actions
that are to be performed if the rule condition is TRUE. The "otherwise" clause is
optional. If given, the set of comma-delimited <statements> are executed if the rule
condition is FALSE.
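
The rule about when the "action" clause may be omitted can be restated as a small
check. The following Python sketch (not K.3) is one reading of the two cases listed
above; the function name rule_is_well_formed and its boolean parameters are
hypothetical.

```python
def rule_is_well_formed(has_condition, has_action, has_otherwise):
    """Check the clause combination of a rule.

    The action clause is mandatory when (i) no condition is given, or
    (ii) a condition is given but no otherwise clause.
    """
    if not has_condition:
        return has_action                  # case (i)
    return has_action or has_otherwise     # case (ii)
```

Under this reading, the rule "referential" in Figure 4-4 (condition and "otherwise",
no "action") and the rule "maxPartCost" in Figure 4-5 (condition and "action", no
"otherwise") are both well formed.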

Parameterized rules are defined using the following syntax:

param_rule <rule_name>
bind_classes <expression_list>
[ bind_if <boolean_expression> ]
[ priority <number> ]
triggered <trigger_conditions>
[ condition <rule_condition> ]
[ action <statements> ]
[ otherwise <statements> ]

Notice that a parameterized rule is similar in structure to an explicit rule. The main
differences are the "bind_classes" and "bind_if" clauses, and the type of expressions
allowed in the rule body. The "bind_classes" clause is used to specify a list of
expressions which, when evaluated, return a list of class names to which the rule will be
bound. The "bind_if" clause, when given, specifies a boolean expression that must be
TRUE in order for the rule to be bound. The "priority", "triggered", "condition", "action"
and "otherwise" clauses are the same as in explicit rules, except for the fact that
parameters can be given inside these clauses. Such parameters are bound during the
binding process. This will become clearer in the next chapter when we discuss
parameterized rules in more detail.

4.2.5  The  "program"  Statement 

The  "program"  statement  declares  a  new  "main"  program.  It  has  the  following  syntax: 

program  <program_name>  [  is  ] 
<statements> 

end;
where <program_name> is a unique program name. This name is given to the run-time
module to execute the program.

4.2.6  The  "global"  Statement 

The "global" statement declares a global variable. Global variables are seen by all
objects in the schema. A global variable declaration is done outside of any class
specification, as follows:

global  <var_name>  :  <type_spec>; 
where  <var_name>  is  the  variable  name,  and  <type_spec>  is  the  type  specification,  as 
described  previously. 

4.3  Expressions 

Expressions,  when  evaluated,  change  the  state  of  one  or  more  objects  and,  if  the  return 
type  of  the  expression  is  not  void,  return  a  value  which  is  an  object.  We  categorize 
expressions  in  K.3  into  conventional  expressions,  expression  lists,  context  expressions  and 
parameterized-rule  expressions. 

4.3.1  Conventional  expressions 

Conventional  expressions  are  found  in  many  object-oriented  languages.  K.3  supports 
the  following  types  of  conventional  expressions: 

Simple  expressions.  Simple  expressions  include  constant  values  and  the  name  of  a 
variable  (global,  constant,  local,  an  association  link  or  a  method  parameter).  Using  the 
name  of  a  class  in  an  expression  returns  the  instance  of  the  metaclass  X  that  represents 
that  class,  where  X  is  the  class  type.  For  example,  the  expression  "Person"  returns  the 
instance of the metaclass "Entity" if "Person" was defined as "Entity".

Method invocations. A method invocation is expressed as follows:
<method_name> ( [ <parameter_list> ] )
where <method_name> is the name of the method to be invoked, and <parameter_list>
is a comma-delimited list of method parameters. A method invocation causes the control
of the program to go to the first statement of the method. Any applicable "before" rules
are triggered before the method is invoked. Any applicable "after" rules are triggered
when the method execution finishes. Applicable "on commit" rules are queued to be
triggered at the end of the transaction in which the method invocation occurs.
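
The order in which rules fire around a method invocation can be sketched in Python
(not K.3). The sketch records names in a log instead of running real rules, and it
assumes that queued "on commit" rules fire in the order in which they were queued,
which the text does not state. The class Transaction and the rule name "checkAccess"
used in the usage note are hypothetical.

```python
class Transaction:
    """Records the order of rule triggering around method invocations."""

    def __init__(self):
        self.log = []            # everything that has run, in order
        self.commit_queue = []   # "on commit" rules waiting for the end

    def invoke(self, method, before=(), after=(), on_commit=()):
        self.log.extend(before)              # "before" rules fire first
        self.log.append(method)              # then the method body runs
        self.log.extend(after)               # "after" rules fire on completion
        self.commit_queue.extend(on_commit)  # deferred until commit

    def commit(self):
        # Assumption: queued rules fire in the order in which they were queued.
        self.log.extend(self.commit_queue)
        self.commit_queue.clear()
        self.log.append("commit")
```

For example, invoking "update(partCost)" with a before rule "checkAccess", an after
rule "maxPartCost" and an on-commit rule "referential" logs the first three names
immediately, and "referential" only when commit is called.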

Operator invocations. Unary operator invocations are expressed as follows:
<operator> <expression>
where <operator> is the symbol of the operator (e.g., +, *, etc.), and <expression> is
an expression which returns the object on which the unary operator will be applied.
Binary operator invocations are expressed as follows:

<expression1> <operator> <expression2>
where <operator> is the operator symbol, and <expression1> and <expression2> are
expressions, of which <expression1> returns the object to which the operator will be
applied, with the object returned by <expression2> being the parameter.

Dotted expressions. A dotted expression is expressed as follows:
<expression1>.<expression2>
where <expression1> is an expression which returns an object on which <expression2>
is evaluated. <expression2> can be the name of an attribute or a method defined in the
class of the object returned by <expression1>. Dotted expressions can be nested to any
number of levels.

4.3.2 Expression lists

Expression lists are a convenience when multiple expressions need to be evaluated
on a single object. This kind of expression is specified as follows:

<expression> { <expression1>, <expression2>, ..., <expressionN> }
where each <expression_i> is evaluated on the object returned by <expression>. Any link
or method name specified in each expression in the list is assumed to be defined in the
class to which the object returned by <expression> belongs.

4.3.3  Context  expressions 

A context expression specifies an OQL query [Ala89] and, when evaluated, returns
a set of extensional patterns called a subdatabase. An extensional pattern contains a set of
references to instances (or oids) interconnected in a graph-like fashion. Context
expressions are expressed in K.3 as follows:

context <association_pattern>
[ where <condition> ]
[ select <range_var_list> ]

The <association_pattern> is an expression which specifies an association pattern that
must be satisfied by the instances to be included in the subdatabase. Some examples of
association patterns are: (i) Part*Circuit, which returns the set of instances of the classes
Part and Circuit which are associated, (ii) Part!Capacitor, which returns the set of
instances of the classes Part and Circuit which are not associated, (iii) VendorPart and
{*Part, *Vendor}, which returns the set of instances of the classes VendorPart, Part and
Vendor, where the instances of VendorPart are connected to some instances of both Part
and Vendor, and (iv) VendorPart or {*Part, *Vendor}, which returns the set of instances
of the classes VendorPart, Part and Vendor, where the instances of VendorPart are
connected to some instances of either Part or Vendor. The "where" clause (similar to
SQL's WHERE clause) specifies a condition to be satisfied by the instances in the
subdatabase, and the "select" clause performs a projection over a set of classes. For more
information on OQL, the reader is referred to [Ala89].

4.3.4  Parameterized-rule  expressions 

Expressions  used  in  parameterized  rules  are  specified  with  the  "@"  operator.  They  are 
used  to  specify  a  portion  of  the  rule  which  must  be  replaced  with  the  text  returned  when 
the expression is evaluated during binding. A conventional expression specified after
"@" must return a String object. There are also some predefined functions that can be
used to do more advanced bindings, as follows:

@<operator>(<expression>). This expression expects that <expression> returns a
string with a comma-delimited list of identifiers. The result after binding is the same list
with <operator> prefixed to each item in the list. For example, @*("a,b,c") evaluates to
"*a,*b,*c".

@and(<expression>). This expression, like the one above, expects that <expression>
returns a string with a comma-delimited list of identifiers. The result after binding is an
expression which has every item in the list joined by "and". For example,
@and("a,b,c") evaluates to "a and b and c".

@or(<expression>). Similar to @and(<expression>), but "or" is used instead.

@foreach(x,<expression1>,<expression2>). This expression returns a string which
is the result of replacing each item in the comma-delimited list in <expression1> by
<expression2>, where <expression2> uses "x", and x is bound to each item in
<expression1>. Example: @foreach(x,"mom,dad","hello @x") returns "hello mom, hello
dad".
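
The textual substitutions performed by these binding functions can be mimicked with
ordinary string operations. The following Python sketch (not K.3) reproduces the
examples given above; the function names prefix_op, join_and, join_or and foreach are
hypothetical stand-ins for the "@" forms, and the simple string replacement used in
foreach is an assumption that would also match longer names beginning with the same
characters.

```python
def prefix_op(operator, items):
    """@<operator>(<expression>): prefix every list item with the operator."""
    return ",".join(operator + item for item in items.split(","))


def join_and(items):
    """@and(<expression>): join the list items with "and"."""
    return " and ".join(items.split(","))


def join_or(items):
    """@or(<expression>): join the list items with "or"."""
    return " or ".join(items.split(","))


def foreach(var, items, template):
    """@foreach(x, <expression1>, <expression2>): expand the template per item."""
    return ", ".join(template.replace("@" + var, item)
                     for item in items.split(","))
```

For instance, if the constituent class names of VendorPart are "Vendor,Part", then
prefix_op("*", "Vendor,Part") yields "*Vendor,*Part", which is the text that would be
substituted inside the braces of the condition in Figure 4-4.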

4.4  Method  Implementation 

Method  implementation  is  syntactically  separated  from  the  specification.  It  is  defined 
using  the  following  syntax: 

method  <class_name>::<method_name>  (  [  <parameter_list>  ]  )  [  is  ] 

<statements> 
end  [  <method_name>  ]; 

The <class_name> uniquely identifies <method_name> as belonging to the class. The
<parameter_list> is a comma-delimited list of parameters, as previously discussed for
method signatures. A list of semicolon-delimited <statements> forms the method
implementation.

4.5  Control  Statements 

Control  statements  in  K.3  are  categorized  into  branching,  looping,  control-flow  and 
blocking. 

4.5.1  Branching  Statements 

Branching statements alter the control flow based on a boolean condition. The
following are the branching statements defined in K.3:

The "if-then-else" statement: The "if-then-else" statement is defined as follows:

if <condition> then
    <statements1>
[ else
    <statements2> ]
end_if

where <condition> is an expression which returns an object of type Boolean. If the value
of the object is TRUE, then <statements1> are executed. If the value is FALSE and the
"else" clause is given, then <statements2> are executed.

The "case" statement: The "case" statement is a short-hand notation for nested
"if-then-else" statements. It tests multiple conditions in turn, and only the first one
that returns TRUE causes its corresponding statements to be executed. The syntax of "case"
is as follows:

case
    when <condition1> do
        <statements1>
    [ when <condition2> do
        <statements2>
      ...
      when <conditionN-1> do
        <statementsN-1> ]
    [ otherwise do
        <statementsN> ]
end_case

where each <condition_i> is a boolean expression. Each condition is evaluated in a
top-down fashion until one of them evaluates to TRUE, in which case the statements that
follow "do" are executed. If none of the expressions is TRUE, and the "otherwise" clause
is given, then <statementsN> are executed. After the execution of any of the lists of
statements following "do", control flows to the statement that follows "end_case".

4.5.2  Looping  Statements 

Looping statements execute a set of statements repeatedly until a state is reached or
the control flow is altered by any of the control-flow statements. The following looping
statements are supported in K.3:

The "for" statement: The "for" statement is a short-hand notation for a sequence of
statements that do initialization and iterate over a set of statements until a condition is
satisfied, changing the state of an object after each iteration. The syntax of "for" is as
follows:

for <initialization> until <condition> by <change> do
    <statements>
end_for

where <initialization> is an initialization expression that is evaluated only at the
beginning of the execution of the statement, <condition> is a boolean expression that,
when TRUE, causes the statement to be terminated, and <change> is an expression
evaluated after every iteration. On each iteration, all the statements in <statements> are
executed.
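
One way to read the "for" statement is as the following expansion, written here in
Python rather than K.3. The sketch assumes that <condition> is tested before each
iteration, including the first; the text does not specify the point at which the test
takes place, so this is an assumption. The function name for_until and its callable
parameters are hypothetical, and "break" and "continue" are not modeled.

```python
def for_until(init, until, change, body):
    """Expand: for <init> until <condition> by <change> do <body> end_for."""
    init()                 # evaluated once, at the beginning
    while not until():     # the loop ends as soon as the condition is TRUE
        body()             # all statements of the loop body
        change()           # evaluated after every iteration
```

With an initialization that sets a counter to 0, a condition "counter >= 3" and a
change that adds 1, the body runs for the counter values 0, 1 and 2.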

The  "while"  statement:  The  "while"  statement  repeatedly  executes  a  set  of 
statements  as  long  as  a  given  condition  is  TRUE.  The  syntax  of  this  statement  is  as 
follows: 

while  <condition>  do 

<statements> 
end_while 

where  <condition>  is  a  boolean  expression  which  needs  to  be  FALSE  to  exit  the 
statement,  and  <statements>  is  a  set  of  statements  which  are  executed  on  each  iteration. 

The "do" statement: The "do" statement iterates over a subdatabase. Subdatabases
are formed using context expressions. On each iteration, range variables defined in the
association pattern are assigned to a new reference (oid) in the subdatabase. This
statement is defined as follows:

<context_expression> do
    <statements>
end;

where <context_expression> is a context expression, and <statements> are the statements
to be executed on each iteration.

4.5.3  Control-Flow  Statements 

Control-flow  statements  alter  the  flow  of  the  control.  Two  control-flow  statements  are 
defined  in  K.3: 

The "break" statement terminates a looping statement by sending control to the
statement that follows it.

The "continue" statement forces control back to the beginning of a looping
statement. Any condition in the looping statement is then re-evaluated.

4.5.4  Blocking  Statements 

Blocking  statements  are  used  to  group  a  set  of  statements.  The  following  blocking 
statements  are  defined  in  K.3: 

The "local" statement is used to declare a new block with local variables. It is defined
as follows:

local <var_decl_list>
begin
    <statements>
end

where <var_decl_list> is a list of comma-delimited variable declarations, as follows:

<var_name> : <type_spec>
where <var_name> is the variable name, and <type_spec> is the variable type
specification, as described previously.

The "begin_trans-end_trans" statement defines the beginning and the end of a
transaction. It is defined as follows:

begin_trans
    <statements>
end_trans

Before the execution of the first statement after "begin_trans", a new transaction is started.
If the control reaches "end_trans", the transaction is committed. Any transaction abort
causes the control to flow to the statement following "end_trans".
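
The control flow of a transaction block can be sketched in Python (not K.3) with an
exception standing in for a transaction abort. The sketch assumes that an abort
discards the changes made inside the block; the text above only describes where
control goes, so the rollback behavior is an assumption. The names TransactionAbort,
run_transaction and set_part_cost are hypothetical.

```python
class TransactionAbort(Exception):
    """Raised by a statement (or a rule) to abort the current transaction."""


def run_transaction(state, statements):
    """Run the statements between begin_trans and end_trans."""
    working = dict(state)            # begin_trans: work on a private copy
    try:
        for statement in statements:
            statement(working)
    except TransactionAbort:
        return state, False          # abort: continue after end_trans
    return working, True             # end_trans reached: commit


def set_part_cost(value):
    """Build a statement that updates partCost and applies rule maxPartCost."""
    def statement(working):
        working["partCost"] = value
        if working["partCost"] > 1000.00:    # condition of maxPartCost
            raise TransactionAbort("Maximum part cost is $1000")
    return statement
```

In either case the caller continues with the statement after the block; the boolean
in the result only indicates whether the block committed.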

4.5.5  Preprocessing 

K.3  uses  the  C  language  preprocessor  as  a  front-end.  Therefore,  all  C  preprocessor 
statements, such as #include and #define, are supported. The #define statement is used
for  defining  macros,  as  we  will  discuss  in  the  next  chapter. 


CHAPTER  5 
MODEL  AND  LANGUAGE  EXTENSIBILITY 

In  this  chapter,  we  present  in  more  detail  how  K.3  supports  model  extensibility,  and 
how  the  language  is  extensible.  We  start  by  giving  an  overview  of  our  approach.  We 
then  present  how  model  and  language  extensions  are  supported  by  a  set  of  mappings 
from  a  language  representation  to  a  knowledge  base  representation.  To  do  this,  we  will 
take  a  formal  approach  using  operational  semantics  [Seb96].  After  this,  we  present  the 
features  of  the  language  that  allow  it  to  support  model  extensions.  The  methodology  used 
by  a  KBC  to  do  model  extensions  is  also  presented.  Finally,  we  present  an  example  to 
show  how  a  KBC  would  make  a  model  extension  and  how  it  is  used  by  a  KBA. 

5.1  Overview 

By  providing  constructs  to  define  metaclasses  and  parameterized  rules,  K.3  allows  a 
KBC  to  define  model  extensions.  Model  extensions  are  defined  by  means  of  metaclasses 
and their parameterized rules, which are stored in the knowledge base for further use by
the K.3 compiler during the compilation of schemas. In order to be used by KBAs and
application developers, all model extensions should have language representations, i.e.
a set of keywords. K.3 is extensible in the sense that such keywords can be introduced
without having to change the traditional components of the compiler (e.g., parser,
semantic checker, code generator, etc.).

K.3 supports extensibility by (i) treating all the names of metaclasses as keywords, (ii)
allowing the definition of metaexpressions, which represent the definitions of properties
in a schema, (iii) evaluating such expressions at compilation time to populate the
knowledge base with objects representing the definition, (iv) allowing the definition of
keywords as macros which are expanded into metaexpressions, and (v) binding
parameterized rules to the applicable classes to enforce the desired semantics.

Unlike  traditional  compilers  which  depend  on  statements  like  "include"  or  "export" 
to  do  semantic  checking,  K.3  maintains  in  the  knowledge  base  a  dictionary  which 
contains  all  the  compiled  specifications.  The  metaschema,  which  defines  all  the 
metaclasses  of  the  metamodel,  also  defines  the  structure  of  the  dictionary.  Therefore,  any 
model extension is also an extension of the K.3 dictionary. Once instances are created
in the dictionary, applicable parameterized rules are bound. The key to language
extensibility lies in (i) how the language is designed so that definitions are mapped to
dictionary updates, and (ii) how the compiler effectively makes use of dictionary
extensions. This is achieved by a series of language-to-knowledge-base mappings. We
present these mappings formally in the next subsection.

5.2  Language  to  Knowledge-Base  Representation  Mappings 

Every definition is represented in the knowledge base by a set of metaobjects, i.e.
objects which are instances of metaclasses. A set of mapping operations performs the
mappings from the language representations to the knowledge base representations. These
mappings occur during the compilation of schemas. We will use operational semantics
in the next sections to illustrate such mappings. In operational semantics, a set of
operations is used to describe how a machine, either real or virtual, executes the
statements of a program. We will represent such operations using the K.3 language itself,
which specifies what really happens during a compilation, i.e., a set of K.3 statements is
executed to update the metadata in the knowledge base (KB).

5.2.1  Class-Definition  Mappings 

As we presented previously, the "define" statement is used to define a new class. The
syntax of "define" is as follows:

define  <cname>  :  <ctype>  in  <sname> 
where: 

<expression_list> 

where <cname> is the class name, <ctype> is the class type, and <sname> is the name
of the schema to which the class belongs. The <expression_list> in the "where" clause is
a list of semicolon-delimited metaexpressions, i.e. expressions that create and update
object instances in the metaschema. Notice that class-type extensibility is supported by
the language by treating as a keyword the name <ctype>, which is actually the name of
the metaclass that represents the class type (e.g., Entity, Domain, etc.).

Given  the  above  grammar,  the  following  mapping  operations  are  performed: 

local c: Class;
begin
    c := <ctype>.pnew() { name := "<cname>" };
    c.schema := context s:Schema where s.name="<sname>";
    _KBMS.evalExpressionList(c,TREE(<expression_list>));
end;

where "pnew" is the system-defined method for creating a new persistent instance. The
statement <ctype>.pnew() creates a new persistent instance of the metaclass <ctype>.
This instance represents the definition of the new class <cname>. The "context"
expression returns the instance of the metaclass Schema which represents the schema
<sname>. The reference returned by this expression is assigned to the attribute "schema"
of the metaclass Class, inherited by the class <ctype>. The global variable _KBMS
represents the KBMS, on which the method "evalExpressionList" is invoked with "c" and
TREE(<expression_list>) as parameters. TREE(x) is the syntax-tree representation of
x. The method "evalExpressionList" will evaluate each expression in
<expression_list> on the instance "c", changing the state of "c" accordingly.

Example.  The  following  statement: 

define Employee : Entity in UnivSchema
where:
    final := TRUE;

is mapped to a KB representation with the following operations:

local c: Class;
begin
    c := Entity.pnew() { name := "Employee" };
    c.schema := context s:Schema where s.name="UnivSchema";
    _KBMS.evalExpressionList(c,TREE(final:=TRUE));
end;

The expression "final:=TRUE", when evaluated, sets the value of the "final" attribute
in the instance "c" of class Entity. We assume that "final" is a model extension defined
as a direct or inherited attribute in the metaclass Entity, and the "where" clause is used
to assign a value to it. In this example, "final" is used to define this class as a final class,
i.e., no subclasses of this class can be defined.
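The class-definition mapping can also be illustrated outside K.3. The following Python sketch is only an illustration under stated assumptions, not the K.3 implementation: the names KnowledgeBase, ClassMeta, define_class and attrs are hypothetical, and the expression list is simplified to (attribute, value) pairs instead of a syntax tree.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Schema:
    name: str


@dataclass
class ClassMeta:
    """Hypothetical stand-in for the metaclass Class."""
    name: str
    schema: Optional[Schema] = None
    attrs: dict = field(default_factory=dict)


class Entity(ClassMeta):
    """Hypothetical stand-in for the metaclass Entity, a class type."""


class KnowledgeBase:
    """Toy dictionary holding schemas, class types and class definitions."""

    def __init__(self):
        self.schemas = {}
        self.class_types = {"Entity": Entity}  # <ctype> names resolve here
        self.classes = []

    def define_class(self, cname, ctype, sname, expressions):
        c = self.class_types[ctype](name=cname)  # <ctype>.pnew() { name := ... }
        c.schema = self.schemas[sname]           # context s:Schema where s.name = ...
        for attr, value in expressions:          # simplified evalExpressionList
            c.attrs[attr] = value
        self.classes.append(c)
        return c


kb = KnowledgeBase()
kb.schemas["UnivSchema"] = Schema("UnivSchema")
employee = kb.define_class("Employee", "Entity", "UnivSchema", [("final", True)])
print(type(employee).__name__, employee.schema.name, employee.attrs)
```

Because the class type is looked up by name, a new class type would only require a new entry in class_types, with no change to define_class.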


5.2.2  Association-Definition  Mappings

The  association  definition  statement  is  defined  as  follows: 

associations:
    <assoc_type>
    {
        <link_name> : <type_spec> [ where { <expression_list> } ]
    }

Association-type extensibility is supported by treating as a keyword the name
<assoc_type>, which is the name of a metaclass representing an association type. The
following mapping operations are performed for this statement:

local a:Assoc;
begin
    a := Assoc$(<assoc_type>.pnew());
    c.assocs.add(a);

where the method "pnew" creates a new persistent instance of the metaclass <assoc_type>,
which represents the association just defined. The operation "Assoc$" performs a "class casting"
operation (similar to the "type casting" operation of C++) of the object to the class Assoc.
The method "add" adds the association instance to the set of associations of instance "c",
created by the "define" statement. The attribute "assocs" in class Class represents the set
of associations of a class. For each link defined in the statement, the following mappings
occur:

    a.links.add(AssocLink.pnew() { linkName := "<link_name>",
                constClass := (context c:Class
                               where c.name="<type_spec>") } );
    _KBMS.evalExpressionList(l,TREE(<expression_list>));

where the method "add" adds a new AssocLink instance to the set of association links
of association "a". A new AssocLink instance is created with the "pnew" method,
evaluating the expression list that follows to set the values of linkName and constClass.
The method "evalExpressionList" evaluates the expressions in the "where" clause on the
new link instance (denoted by "l"), as described previously.

Example.  The  following  statement: 

associations:
    Aggregation
    {
        name : String where { unchangeable := TRUE };
    }

is mapped to the knowledge base representation by the following operations:

local a: Assoc;
begin
    a := Assoc$(Aggregation.pnew());
    c.assocs.add(a);
    a.links.add(AssocLink.pnew() { linkName := "name",
                constClass := (context c:Class
                               where c.name="String") } );
    _KBMS.evalExpressionList(l,TREE(unchangeable := TRUE));


The expression "unchangeable := TRUE" sets the value of the "unchangeable" attribute,
defined in metaclass AssocLink. This attribute defines a model extension which is used
to define attributes as unchangeable, i.e., their values cannot be changed once they are
set.
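As an illustration of the association mapping, the following Python sketch mirrors the steps above. It is a sketch under assumptions: ASSOC_TYPES, define_association and the attrs dictionary are hypothetical names, constituent classes are referred to by name only, and "where" expressions are simplified to (attribute, value) pairs.

```python
from dataclasses import dataclass, field


@dataclass
class AssocLink:
    """Hypothetical stand-in for the metaclass AssocLink."""
    link_name: str
    const_class: str
    attrs: dict = field(default_factory=dict)


@dataclass
class Assoc:
    """Hypothetical stand-in for the metaclass Assoc."""
    links: list = field(default_factory=list)


class Aggregation(Assoc):
    """An association type, modeled as a subclass of Assoc."""


# <assoc_type> names are resolved against the known association types.
ASSOC_TYPES = {"Aggregation": Aggregation}


def define_association(class_assocs, assoc_type, link_specs):
    a = ASSOC_TYPES[assoc_type]()            # <assoc_type>.pnew()
    class_assocs.append(a)                   # c.assocs.add(a)
    for link_name, type_spec, expressions in link_specs:
        link = AssocLink(link_name, type_spec)
        a.links.append(link)                 # a.links.add(...)
        for attr, value in expressions:      # simplified evalExpressionList
            link.attrs[attr] = value
    return a


assocs = []
agg = define_association(
    assocs, "Aggregation", [("name", "String", [("unchangeable", True)])]
)
print(agg.links[0])
```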


5.2.3  Method-Definition  Mappings

Methods  are  represented  in  the  KB  by  instances  of  the  metaclass  Method.  The 
method  definition  statement  has  the  following  syntax: 


method  <method_name>  (  <parameter_list>  )  [  :  <type_spec>  ] 
[  where  {  <expression_list>  }  ] 

The  following  mapping  operations  are  performed  for  this  statement: 

local m:Method, p:MethodParam;
begin
    m := Method.pnew() { signature := SIGNATURE(<method_name>,
                         <parameter_list>), retValue := (context c:Class
                         where c.name="<type_spec>") };
    c.meths.add(m);
    _KBMS.evalExpressionList(m,TREE(<expression_list>));

where SIGNATURE is a function that creates a unique method signature from the method
name and the parameter types in <parameter_list> (e.g., "enroll(Course)" for method
"enroll(c:Course)"). An instance of the Method metaclass is created using "pnew", then
an expression list is used to set the values of "signature" and "retValue". The instance "m"
is then added to "meths", the set of methods of instance "c" (created by the "define"
statement). The method "evalExpressionList" evaluates each expression in
<expression_list> on the instance "m", changing the state of "m".

For  each  parameter  in  <parameter_list>,  the  following  mapping  operations  are 
performed: 

    p := MethodParam.pnew() { paramName := "<pname>",
                              paramType := (context c:Class
                              where c.name="<type_spec>") };
    m.params.add(p);

where  <pname>  and  <type_spec>  are  the  parameter  name  and  type,  respectively.  Each 
parameter  "p"  is  added  to  the  list  of  method  parameters  "params". 
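The SIGNATURE function can be sketched as follows. This is a minimal Python illustration, assuming that parameters are given as (name, type) pairs and that parameter types are separated by commas; the text only shows the one-parameter case "enroll(Course)", so the separator is an assumption.

```python
def signature(method_name, parameters):
    """Build a signature string from a method name and (name, type) pairs."""
    types = ",".join(ptype for _pname, ptype in parameters)
    return f"{method_name}({types})"


print(signature("enroll", [("c", "Course")]))
```

Since only the parameter types enter the signature, two methods with the same name but different parameter types receive different signatures.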

5.2.4  Rule-Definition  Mappings

Explicit rules and parameterized rules are represented in the KB by instances of the
metaclasses Rule and ParamRule, respectively. The following is the syntax of the
explicit-rule definition statement:

rule <rule_name> is
    triggered <trigger_list>
    <rule_body>
end;

where <rule_body> consists of the "condition", "action" and "otherwise" clauses. The following
mapping operations are performed:



local r : Rule, t : RuleTrigger;
begin
    r := Rule.pnew() { ruleId := "<rule_name>",
                       ruleBody := STRING_FORM(<rule_body>) };
    c.ruls.add(r);

where "ruleId" is an attribute defined in the metaclass Rule, and STRING_FORM is a function that
maps the <rule_body> to a string representation. The "add" method will add the rule "r"
to the list of rules ("ruls") of instance "c", which represents the class created by the
"define" statement.

For  each  <trigger>  in  <trigger_list>  the  following  mappings  occur: 

    t := RuleTrigger.pnew() { trigger_time := "<trig_time>", trigger_op := "<trig_op>",
                              trigger_param := "<trig_param>" };
    r.triggers.add(t);

where <trig_time>, <trig_op> and <trig_param> are the trigger time, operation and
parameter, respectively.

Parameterized  rules  are  defined  as  follows: 

param_rule <rule_name>
    bind_classes <expression_list>
    [ bind_if <boolean_expression> ]
    triggered <trigger_conditions>
    <rule_body>
end;

The  mapping  operations  are  similar  to  those  of  explicit  rules,  as  follows: 

local r : ParamRule, t : RuleTrigger;
begin
    r := ParamRule.pnew() { ruleId := "<rule_name>",
                            ruleBody := STRING_FORM(<rule_body>),
                            bindClasses := STRING_FORM(<expression_list>),
                            bindIf := STRING_FORM(<boolean_expression>) };
    c.ruls.add(r);

Notice that the <expression_list>, <boolean_expression> and <rule_body> are stored
in textual form using the mapping function STRING_FORM. This text is used later
in the binding process. The trigger clauses are stored in the same way as in explicit rules.
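The shape of the stored rule metaobjects can be pictured with a short Python sketch. It only illustrates the point that rule bodies and binding clauses are kept as text while triggers are kept as structured records; the field names follow the attributes used above, and the sample rule text is an illustrative assumption rather than a rule taken from the metaschema.

```python
from dataclasses import dataclass, field


@dataclass
class RuleTrigger:
    trigger_time: str
    trigger_op: str
    trigger_param: str = ""


@dataclass
class Rule:
    rule_id: str
    rule_body: str                      # result of STRING_FORM(<rule_body>)
    triggers: list = field(default_factory=list)


@dataclass
class ParamRule(Rule):
    bind_classes: str = ""              # STRING_FORM(<expression_list>)
    bind_if: str = ""                   # STRING_FORM(<boolean_expression>)


# Illustrative rule text only; parameters stay unevaluated until binding.
r = ParamRule(
    rule_id="referential",
    rule_body="condition EXIST(context this and {@*constituentClassNames()})",
    bind_classes="@definingClassName()",
)
r.triggers.append(RuleTrigger("on_commit", "create"))
r.triggers.append(RuleTrigger("on_commit", "update", "@linkNames()"))
print(r.rule_id, len(r.triggers))
```

Keeping the clauses as text defers their interpretation to binding time, when the "@" parameters can be evaluated against a concrete metaobject.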

5.3  Language  Support  of  Extensibility 

In this section, we present the features of the language that allow it to support model
extensions without requiring any programming effort to change the compiler.

5.3.1  Names  as  Keywords 

The  syntactic  structure  of  K.3  is  designed  to  support  the  modeling  constructs  of  the 
kernel  model,  with  built-in  keywords  such  as  "define",  "associations",  "method"  and 
"rule".  If  the  model  is  extended,  the  language  needs  to  be  extended  accordingly  such  that 
extensions  which  do  not  have  language  representations  can  be  used  in  schema  definitions. 
To  support  class-  and  association-type  extensibility,  K.3  treats  the  names  defined  in  the 
metaschema  as  keywords.  Names  of  metaclasses  that  represent  class  and  association  types 
can  be  used  directly  in  definitions.  Such  names  are  used  by  the  mapping  operations  to 
create  instances  in  the  metaschema,  which  will  be  used  in  the  parameterized  rule  binding 
process. 
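The idea of treating metaschema names as keywords can be sketched as a token classifier that consults the dictionary instead of a fixed keyword table. The Python code below is a hypothetical illustration; the names Dictionary, classify and the category labels are assumptions and do not describe the actual K.3 lexer.

```python
# Built-in keywords of the kernel language (from the text above).
KERNEL_KEYWORDS = {"define", "associations", "method", "rule"}


class Dictionary:
    """Toy dictionary that records the names of metaclasses."""

    def __init__(self):
        self.metaclass_names = {"Entity", "Domain", "Aggregation"}

    def add_metaclass(self, name):
        self.metaclass_names.add(name)


def classify(token, dictionary):
    """Classify a token, consulting the dictionary for type names."""
    if token in KERNEL_KEYWORDS:
        return "keyword"
    if token in dictionary.metaclass_names:
        return "type-keyword"
    return "identifier"


d = Dictionary()
before = classify("Interaction", d)
d.add_metaclass("Interaction")   # model extension: new association type
after = classify("Interaction", d)
print(before, after)
```

The classifier itself never changes; only the dictionary contents do, which is the property that lets the language follow model extensions.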

5.3.2  Metaexpressions  and  Compilation-Time  Evaluation 

Some model extensions are defined as attributes of metaclasses. Specifications that use
such extensions need to update the values of such attributes. For example, methods may
have an additional attribute called "documentation" which contains a textual
documentation of the method, to be stored in the KB. Given a method represented by the
instance "m" in the metaclass Method, the value of the "documentation" attribute can be
updated with the following expression in K.3:

m.documentation := "This method computes the GPA of a student"

We call such an expression, which updates an instance of a metaclass, a metaexpression.
K.3 provides a "where" clause to allow the definition of metaexpressions. This clause is
allowed in all the places where model extensions are supported. Currently, it is allowed
in the class definition statement, in the "associations" section, in the association link
definition statement, and in the method definition statement to define additional
properties for classes, associations, association links and methods, respectively.
Methods used in a metaexpression are invoked on an instance of Class, Assoc, AssocLink
or Method, depending on whether the metaexpression appears in the "where" clause of the
class, association, link or method definition, respectively.

Another example that illustrates the need for this feature is the definition of new types
of constraints. Constraints are not part of the kernel model; therefore, there is no
"constraints"  section  in  the  class  definition  statement.  Let  us  assume  that  the  KBC  has 
defined  the  metaclass  Constraint  for  this  purpose,  and  that  any  new  type  of  constraint  is 
defined  by  defining  subclasses  of  this  class.  Any  constraint  defined  in  a  class  definition 
should  have  representative  instances  in  these  metaclasses. 

One  example  of  the  use  of  metaexpressions  is  the  definition  of  the 
CARDINALITY(min,max)  constraint  (described  briefly  in  Chapter  4)  for  an  association 
link,  as  follows: 

associations:
    Interaction
    {
        vendor : Vendor where { Cardinality.pnew() { min:=0, max:=10,
                                constraintDefLinks.add(this) } };
    }

This  expression  creates  a  new  instance  of  the  metaclass  Cardinality  (subclass  of 
Constraint)  by  invoking  the  method  "pnew",  assigns  the  values  0  and  10  to  the  attributes 
min  and  max,  respectively,  and  adds  "this"  (instance  of  AssocLink  implicitly  created  by 
the  definition  statement)  to  the  set  of  constraint  defining  links  of  the  constraint,  i.e.  the 
set  of  links  to  which  the  constraint  applies. 

5.3.3  Macros 

Notice  that  metaexpressions  can  look  messy.  Ideally,  the  KBA  should  not  be 
concerned  about  metaexpressions  at  all,  as  they  are  just  a  way  to  support  extensibility 
without  changing  the  implementation  of  the  compiler.  Macros  are  a  way  to  define 
metaexpressions  as  keywords  used  by  the  KBA.  For  example,  the  following  macro  defines 
the keyword CARDINALITY to be used by the KBA to define the
CARDINALITY(min,max) constraint:

#define CARDINALITY(m,n) Cardinality.pnew() { min:=m, max:=n, \
                         constraintDefLinks.add(this) }

Instead of using a metaexpression, the KBA can simply use the keyword (macro)
CARDINALITY in the definition of such a constraint, as we have seen in previous
examples. Macros are expanded at compilation time by preprocessing. Notice that our
previous example looks neater now:

associations:
    Interaction
    {
        vendor : Vendor where { CARDINALITY(0,10) };
    }
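The effect of macro expansion on this example can be sketched in Python. This is a deliberately simplified stand-in for a preprocessor, written only for illustration: the function expand_macro is hypothetical, it handles a single macro with non-nested arguments, and it ignores details such as string literals and line continuation.

```python
import re


def expand_macro(text, name, params, body):
    """Replace each call name(arg, ...) in text by body with params substituted."""
    call = re.compile(rf"\b{re.escape(name)}\(([^()]*)\)")

    def substitute(match):
        args = [a.strip() for a in match.group(1).split(",")]
        if len(args) != len(params):
            raise ValueError(f"{name} expects {len(params)} arguments")
        result = body
        for param, arg in zip(params, args):
            # Replace whole-word occurrences of the formal parameter only.
            result = re.sub(rf"\b{re.escape(param)}\b", lambda _m, a=arg: a, result)
        return result

    return call.sub(substitute, text)


BODY = "Cardinality.pnew() { min:=m, max:=n, constraintDefLinks.add(this) }"
line = "vendor : Vendor where { CARDINALITY(0,10) };"
expanded = expand_macro(line, "CARDINALITY", ["m", "n"], BODY)
print(expanded)
```

The expanded line is the metaexpression form of the link definition, so the compiler proper never needs to know that the keyword CARDINALITY exists.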

5.3.4  Parameterized  Rule  Binding

For each instance of a metaclass (e.g., an association) created at compilation time,
the K.3 compiler will bind the parameterized rules defined in that metaclass to the
classes specified in the "bind_classes" clause of the parameterized rules. In this way, the
semantics defined by the KBC are bound to the appropriate classes (e.g., the constituent
classes in an association).

Parameterized rules are stored in the KB for later retrieval and binding by the K.3
compiler. Binding is done by the following algorithm:

foreach (i : instance of a metaclass created at current compilation) do
    foreach (p : parameterized rule defined for i) do
        foreach (c : bind class defined in p) do
            if (expression in bind_if evaluates to TRUE)
                bind p to c, evaluating @ expressions

Expressions specified by the "@" operator define the rule parameters. Names used
in such expressions must be defined in the metaclass for which the parameterized rules
are defined. Once rules are bound to the applicable classes, they are compiled as if they
were defined explicitly by the user. An example of an explicit rule is the one resulting
from binding the parameterized rule "referential", defined in the metaclass Interaction,
to the class VendorPart, as presented in Figure 4-5:

rule VendorPart::referential02987
    triggered on_commit create(), update(vendor,part)
    condition
        EXIST(context this and {*Vendor, *Part})
    otherwise
        abort_tx("REFERENTIAL constraint violated in class "
                 + "VendorPart");
end;

Notice that the parameters defined with the "@" operator have been substituted with
text specific to the class VendorPart. The parameter @linkNames() has been replaced
with the names of the association links defined in the association. The parameter
@*constituentClassNames() has been substituted with a list of constituent class names,
prepending a star (*) symbol to each class name, to form a valid context expression.
Similarly, the parameter @definingClassName() was substituted with the name of the
defining class of the association. The methods invoked in these parameters were defined
by the KBC in the metaschema, specifically in the class Assoc.
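The substitution of "@" parameters during binding can be sketched in Python as follows. This is an illustration under assumptions, not the K.3 binder: the class Assoc below is a toy stand-in, the rule template is a hypothetical reconstruction (the parameterized form of "referential" appears in Figure 4-5, not here), and bind_if is modeled as an optional Python predicate rather than a K.3 expression. The outermost loop over newly created metaobjects is left to the caller.

```python
import re


class Assoc:
    """Toy stand-in for an Assoc metaobject with KBC-defined methods."""

    def __init__(self, defining_class, links):
        self.defining_class = defining_class
        self.links = links  # list of (link name, constituent class name)

    def definingClassName(self):
        return self.defining_class

    def linkNames(self):
        return ",".join(name for name, _ in self.links)

    def constituentClassNames(self):
        return ", ".join(cls for _, cls in self.links)


PARAM = re.compile(r"@(\*?)(\w+)\(\)")


def substitute(text, metaobject):
    """Replace every @method() or @*method() by the method's result."""
    def replace(match):
        star, method = match.groups()
        value = getattr(metaobject, method)()
        if star:  # prepend "*" to each name of a comma-separated list
            value = ", ".join("*" + v.strip() for v in value.split(","))
        return value
    return PARAM.sub(replace, text)


def bind_rules(metaobject, param_rules):
    """Return (class name, rule text) pairs for one metaobject."""
    bound = []
    for rule in param_rules:
        targets = substitute(rule["bind_classes"], metaobject).split(",")
        for target in targets:
            if rule.get("bind_if", lambda obj: True)(metaobject):
                bound.append((target.strip(), substitute(rule["body"], metaobject)))
    return bound


referential = {
    "bind_classes": "@definingClassName()",
    "body": "triggered on_commit create(), update(@linkNames()) "
            "condition EXIST(context this and {@*constituentClassNames()})",
}
vendor_part = Assoc("VendorPart", [("vendor", "Vendor"), ("part", "Part")])
bound = bind_rules(vendor_part, [referential])
print(bound[0][0])
print(bound[0][1])
```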


5.4  Methodology 

In  our  procedural  approach  for  representing  semantics,  the  KBC  needs  to  be  aware  of 
the  following:  (i)  the  mappings  from  language  representations  to  KB  representations,  (ii) 
the  parts  of  the  kernel  model  that  are  extensible,  and  (iii)  the  process  of  binding 
parameterized  rules.  With  this  in  mind,  given  a  desired  model  extension,  a  KBC  defines 
the  following: 

1) A set of metaclasses
2) A set of associations with metaclasses already in the kernel model, such as
   specializations
3) A set of attributes
4) A set of supporting methods
5) A set of parameterized rules

Each metaclass defined by the KBC represents a specification made by the KBA. Such
a metaclass is associated with other metaclasses, such as a derivation from an existing
metaclass (e.g., "Interaction" as a new association type, defined as a subclass of "Assoc"),
or a modification of the structure of a metaclass (e.g., "Set of Constraint" defines an
attribute of the metaclass "Class").

Some  attributes  may  be  needed  to  describe  model  extensions.  For  example,  a  MAX(n) 
class  constraint  is  described  by  the  maximum  number  of  objects  allowed  in  a  class,  e.g., 
the  "maxobjs"  attribute.  The  KBC  should  identify  such  attributes  and  define  them  in  the 
corresponding  metaclass. 

Supporting methods may be needed by the KBA to define binding classes in
parameterized rules, or to use them in metaexpressions. Such methods may modify the
state of the KB or perform complex queries to obtain a value used in binding. An
example of the former case is a method that defines an index on a class attribute. Such a
method may be defined in the metaclass Entity, and its implementation would create an
instance of the metaclass Index and associate it with instances of the classes Entity and
Assoc. An example of the latter case is the method "constituentClasses", which returns a
comma-delimited list of constituent class names given an association. Such a method may
need to query the KB to retrieve the AssocLink instances associated with the given Assoc
instance, retrieve Class instances associated with the AssocLink instances, then construct
the list from these class names.

Parameterized rules, as shown before, represent the general semantics of model
extensions. The KBC should identify which attributes of the model are "variable" and
define them as parameters in parameterized rules. To do this, techniques such as pattern
identification may be useful.

In general, the process of making model extensions using procedural semantics requires
ingenuity on the part of the KBC as well as knowledge of the binding process. What makes
our approach different from others is the flexibility we provide to the KBC to make model
extensions using a high-level specification language. In its absence, the KBC would
be required to know the details of the system implementation such as code generation,
internal representation and parsing. This is how model extensions are done in monolithic
DBMSs.

5.5  An  Example 

So far, we have presented the features of K.3 that allow it to support model
extensibility. In this section, we illustrate by means of an example how
the KBC and KBA use these features. Our example consists of the definition of the
CARDINALITY(min,max) constraint.

Let us assume that the KBC starts with the kernel model in which constraints are not
supported. The first thing that the KBC needs to do is to extend the model to support
constraints in general. The KBC may decide that a KBA may define constraints on a
class (e.g., maximum objects in a class), on an association link (e.g., cardinality) or on a
set of association links (e.g., composite key). Based on these criteria, he/she defines the
metaclass Constraint using K.3 as follows:

define Constraint : MetaClass
associations:
    Aggregation
    {
        constraintDefClass : Class;
        constraintDefLinks : Set of AssocLink;
    }
methods:
public:
    method className() : String;
    method linkNames() : String;
    method definingClassNames() : String;
    method constClassNames() : String;
end;

The methods className() and linkNames() return the names of the class and links for
which the constraint is defined, respectively. The methods definingClassNames() and
constClassNames() return the names of the defining and constituent classes defined by
the association links of the constraint, respectively. They can be used to define
parameters for parameterized rules, as we shall see later.

With  the  metaclass  Constraint  defined,  the  model  is  ready  to  be  extended  with  new 
constraint  types,  created  by  defining  subclasses  of  this  metaclass.  One  example  of  a 
constraint  type  is  the  CARDINALITY(min,max)  constraint  discussed  previously.  A 
CARDINALITY(min,max)  constraint  can  be  formally  defined  as  follows: 

CARDINALITY(MIN,MAX):  Let  D  and  C  be  the  defining  and  constituent 
classes  in  an  association.  Let  I  be  an  instance  in  C.  Then  I  must  be  associated  with  at 
least  MIN  instances  in  D,  and  cannot  be  associated  with  more  than  MAX  instances  in 
D. 
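This definition can be restated as a small executable check. The Python sketch below is only an illustration of the definition, assuming that the association is available as a collection of (defining instance, constituent instance) pairs; the function name satisfies_cardinality and this representation are our assumptions, not part of K.3.

```python
def satisfies_cardinality(instance, links, minimum, maximum):
    """Check CARDINALITY(minimum, maximum) for one constituent instance.

    links is an iterable of (defining_instance, constituent_instance) pairs.
    """
    # Distinct instances of the defining class D associated with the instance.
    associated = {d for d, c in links if c == instance}
    return minimum <= len(associated) <= maximum


# Two VendorPart instances refer to vendor "v1"; none refers to "v2".
links = [("vp1", "v1"), ("vp2", "v1")]
print(satisfies_cardinality("v1", links, 0, 10))
print(satisfies_cardinality("v2", links, 1, 10))
```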

The KBC will define this new type of constraint by defining a new metaclass called
Cardinality, a subclass of the metaclass Constraint. It is known that this constraint has
two attributes, namely "min" and "max", which hold the values of the MIN and MAX
parameters, respectively. The following is the definition of the metaclass Cardinality in
K.3:

define Cardinality : MetaClass
associations:
public:
    SPECIALIZATION { Constraint };
    Aggregation
    {
        min : Integer;
        max : Integer;
    }
end;

The  attributes  of  Cardinality  should  be  updated  when  a  new  constraint  of  this  type  is 
defined  on  an  association  link,  i.e.,  when  a  new  instance  of  Cardinality  is  created  at 
compilation  time.  The  following  expression  should  be  used  at  compilation  time  for  this 
purpose: 

Cardinality.pnew() { min:=MIN, max:=MAX, constraintDefLinks.add(this) }

which creates a new instance of the Cardinality metaclass and updates the values of its
attributes "min", "max" and "constraintDefLinks" (this last one inherited from the
metaclass Constraint). Since this constraint is defined for association links, "this"
represents the receiver of the message at compilation-time evaluation, which is the
instance of AssocLink created by the definition of a new association link.

The above metaexpression looks messy and presents details not relevant to the KBA.
To avoid this and to present the constraint definition at a higher level, the KBC defines
the macro CARDINALITY(MIN,MAX) as follows:

#define CARDINALITY(MIN,MAX) \
        Cardinality.pnew() { min:=MIN, max:=MAX, \
        constraintDefLinks.add(this) }

It is part of the KBC's job to define formally the semantics of the constraint. Once
this is done, it is easier to do the next task, which is to identify the operations (events)
that may violate the constraint. For this example, the KBC finds that the following
operations may violate the constraint:

1) Update the value of an association link (associate two instances)
2) Create a new instance in the constituent class
3) Delete an instance in the constituent class

The next step is to identify which conditions will violate the constraint or which ones
must hold so that the constraint is not violated. The KBC finds that the following
conditions must hold:

1)  Defining  class:  The  number  of  instances  of  this  class  associated  with  the 
associated  instance  in  the  constituent  class  should  be  greater  than  or  equal  to  MIN  and 
less  than  or  equal  to  MAX. 

2)  Constituent  class:  The  number  of  instances  in  the  defining  class  associated  with 
this  instance  should  be  greater  than  or  equal  to  MIN  and  less  than  or  equal  to  MAX.  If 
the  instance  is  going  to  be  deleted,  then  the  number  of  instances  in  the  defining  class 
associated  with  this  instance  should  be  greater  than  MIN. 

This  constraint  can  be  enforced  with  three  rules,  one  in  the  defining  class  and  two 
in  the  constituent  class.  The  following  parameterized  rules  are  defined  in  the  metaclass 
Cardinality: 

/* rule to be bound to the constituent class */
param_rule Cardinality::part1
    bind_classes @constClassNames()
    triggered after create()
    condition
        COUNT(context this * @definingClassNames()) >= @min
        and COUNT(context this * @definingClassNames()) <= @max
    otherwise
        abort_tx("CARDINALITY constraint violated in class "+
                 "@constClassNames()");
end;

/* rule to be bound to the constituent class */
param_rule Cardinality::part2
    bind_classes @constClassNames()
    triggered before del()
    condition
        COUNT(context this * @definingClassNames()) > @min
    otherwise
        abort_tx("CARDINALITY constraint violated in class "+
                 "@constClassNames()");
end;

/* rule to be bound to the defining class */
param_rule Cardinality::part2
    bind_classes @definingClassNames()
    triggered after update(@linkNames())
    condition
        COUNT(context this * @constClassNames()) >= @min and
        COUNT(context this * @constClassNames()) <= @max
    otherwise
        abort_tx("CARDINALITY constraint violated in class "+
                 "@definingClassNames()");
end;

The  COUNT  macro  is  used  to  count  the  number  of  instances  returned  by  the  "context" 
expression.  A  "context"  expression  in  K.3  defines  a  query  in  the  form  of  an  association 
pattern.  An  association  pattern  specifies  a  pattern  which  must  be  matched  by  the 
connected  instances  in  the  database.  This  will  become  clearer  when  we  present  the 
explicit  rules  generated  for  this  example. 

The  KBC  compiles  these  specifications  using  the  K.3  compiler.  The  definitions  then 
become  instances  of  metaclasses,  and  the  rules  are  stored  in  the  KB  for  later  retrieval  by 
the  compiler  during  binding.  Once  this  is  done,  the  extended  model  is  ready  to  be  used 
by  the  KBA  in  his/her  specifications,  as  illustrated  by  the  example  already  given  in 
Figure  4-5,  where  the  CARDINALITY(min,max)  constraint  is  used.  For  this  example, 
the  following  explicit  rules  will  be  generated  for  the  class  Vendor: 

rule Vendor::part1_189278
    triggered after create()
    condition
        COUNT(context this * VendorPart) >= 0 and
        COUNT(context this * VendorPart) <= 10
    otherwise
        abort_tx("CARDINALITY constraint violated in class "+
                 "VendorPart");
end;

The CARDINALITY(min,max) constraint is enforced by counting the number of
VendorPart instances associated with "this" instance (to which the rule is applied). The
"context" expression is used for this purpose by providing an association pattern which
must be satisfied by the instances in the database. In this case, the "*" operator is used
in the "context" expression to return the instances of the class VendorPart that are
associated with (or "connected" to) "this" Vendor instance. The constraint is violated if
the expression given in the "condition" part of the rule evaluates to FALSE, in which
case the "otherwise" clause of the rule is followed, thus causing the transaction to
abort.

Similarly,  the  following  rules  will  be  bound  to  the  class  VendorPart: 

rule VendorPart::part2_1298
    triggered after update(vendor)
    condition
        COUNT(context this * Vendor) >= 0 and
        COUNT(context this * Vendor) <= 10
    otherwise
        abort_tx("CARDINALITY constraint violated in class "+
                 "VendorPart");
end;

rule VendorPart::part2_1299
    triggered after update(part)
    condition
        COUNT(context this * Part) >= 1 and
        COUNT(context this * Part) <= 1
    otherwise
        abort_tx("CARDINALITY constraint violated in class "+
                 "VendorPart");
end;


CHAPTER  6 
SYSTEM  IMPLEMENTATION 
This chapter presents the implementation of K.3. We start with an overview of the
implementation approach, followed by a discussion of the system architecture. The
mappings from class specification and implementation to C++ are also presented, and the
compilation process is discussed. We conclude this chapter with some
interesting problems found in the implementation.

6.1  Overview 

The compiler of K.3 is implemented in C++ and runs on the Sun4/Solaris platform.
After the kernel model was defined and kernel metaclasses were implemented in C++,
parts of the system (e.g., the Dictionary) were implemented in K.3 itself. The
metaschema is fully specified in K.3, and a majority of the methods of metaclasses are
implemented in K.3.

The compiler of K.3 generates C++ code which is compiled with the GNU g++
compiler. For every K class, an equivalent C++ class is generated. C++ templates are
used to represent properties such as persistence, aggregate types and ownership. Control
structures in K.3 are mapped directly to their counterparts in C++. More details on these
mappings will be presented later in this chapter.

When the model is extended, K.3 specifications are compiled and their corresponding
C++ implementations generated and compiled by g++. The resulting object code is
relinked with the existing K.3 components, thus forming a new K.3 compiler which
recognizes model extensions. This process is known as bootstrapping. It should be noted
that although a new compiler results from bootstrapping, the creation of a new compiler
that recognizes the model extensions only involves a simple linking of its components.

6.2  System  Architecture 

The  system  architecture  of  K.3  is  shown  in  Figure  6-1.  System  components  are 
represented  by  rectangles  and  their  usage  relationships  are  represented  using  links. 
Components  were  defined  using  either  C++  or  K.3  classes.  The  following  is  a  brief 
description  of  each  component: 

Preprocessor.  The  Preprocessor  module  does  the  preprocessing  of  input  text. 
Currently,  the  C  preprocessor  is  used. 

Compiler. This module is the main interface to the system. It makes use of the
Preprocessor, Parser, SemCheck and IntCodeGen modules.

Parser.  This  module  does  lexical  analysis  and  parsing  of  the  input  text.  It  generates 
an  abstract  representation  called  the  KAST  (K  Abstract  Syntax  Tree).  This  tree 
representation  is  used  by  the  rest  of  the  system  components.  This  module  was 
implemented  using  LEX  and  YACC. 

SemCheck.  The  SemCheck  module  does  semantic  checking,  i.e.,  it  checks  for  the 
semantic  correctness  of  input  K.3  programs.  It  makes  use  of  the  SymTabHandler  to  check 
for the existence of symbols and their types.

Figure 6-1. K.3 System Architecture


SymTabHandler.  This  is  the  symbol  table  handler  module,  responsible  for 
manipulating  symbol  tables.  Symbol  tables  are  used  to  store  symbol  information  such  as 
class,  association,  method,  rule  and  variable  names,  and  their  respective  types.  This 
module builds a symbol table from the KAST.

IntCodeGen. This module generates C++ (or "intermediate") code. It interfaces with
the  KBMS  for  services  such  as  storage  and  retrieval  of  metadata,  and  dynamic  expression 
evaluation  (DEE). 

CodeGen.  The  CodeGen  module  generates  executable  machine  code.  It  uses  the  g++ 
compiler  for  this  purpose,  as  well  as  other  operating  system  programs  such  as  "make". 

PRuleBinder.  This  module  does  parameterized  rule  binding.  It  uses  the  KBMS  for 
services  such  as  storage  and  retrieval  of  metadata,  and  DEE  [Kun94]. 

KBMS.  The  KBMS  module  is  the  main  interface  to  the  KBMS.  It  uses  metaclasses 
such  as  Dictionary,  Class,  MetaClass,  Assoc,  AssocLink,  Method  and  Rule,  as  well  as 
other system components such as ObjectMgr (the Object Manager), QueryProc (Query
Processor)  and  a  Distributed  Storage  Manager  (DSM)  [Vis96]. 

6.3  Mappings  to  C++ 

The  compiler  of  K.3  generates  C++  code.  In  this  section,  we  will  present  the 
mappings  from  K.3  specification  and  implementation  to  C++. 

6.3.1  Mapping  of  Specifications 

For  every  K  class,  an  equivalent  C++  class  is  generated.  To  avoid  any  conflict  with 
other  components  not  implemented  in  K.3,  the  compiler  generates  a  C++  class  name  as 
follows: 

_KCLASS_<k3classname> 
where  <k3classname>  is  the  name  of  the  K.3  class.  Each  equivalent  C++  class  is  defined 
as a subclass of the predefined C++ class KObject.

Associations are mapped to C++ data members. This includes Generalization
associations. Notice that Generalization associations are not mapped to subclass
derivations using "public", because the C++ storage model is static whereas the storage
model of K.3 is dynamic, as discussed by Yaseen [Yas91], Arroyo [Arr92] and Block [Blo96].

C++  templates  are  used  to  represent  properties  such  as  persistence,  aggregate  types 
and  ownership.  For  example,  the  template  _KdbObject<T>  defines  class  T  as  a  persistent 
class.  Similarly,  the  template  _Kdomain<T>  defines  class  T  as  a  "domain"  (attribute)  of 
another class. The template Set<T> defines a set of objects of type T. The template-based
approach is described in more detail by Block [Blo96].

For every K.3 method X, three C++ methods are defined: (i) _KBEGIN_X, whose
implementation executes any rules that are triggered "before" the method, (ii) _KEND_X,
whose implementation executes any rules that are triggered "after" the method, and
(iii) X itself, which invokes _KBEGIN_X first, performs the method
implementation defined by the user, and then invokes _KEND_X. For every rule R, a
C++  method  _KRULE_R  is  generated  containing  the  implementation  of  the  rule  body 
(condition,  action,  otherwise  clauses).  When  R  is  triggered,  the  method  _KRULE_R  is 
invoked  and  executed.  Figure  6-2  presents  an  example  of  a  K.3  specification,  and  Figure 
6-3  shows  the  corresponding  generated  C++  code. 
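
The three-method pattern can be illustrated with a minimal sketch. This is not the code emitted by the K.3 compiler: the class PartSketch, the trace vector and the helper trace_of_del are hypothetical, and the generated names are written without their leading underscore because standard C++ reserves such identifiers for the implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of the pattern for a K.3 method "del" with one "before" rule
// (display_deletion). All names here are illustrative assumptions.
class PartSketch {
public:
    std::string part_no;
    std::vector<std::string> trace;  // records the call order for inspection

    // The method itself: before-rules, user-defined body, after-rules.
    void del() {
        KBEGIN_del();
        trace.push_back("del body");
        KEND_del();
    }

private:
    // Executes every rule triggered "before del()".
    void KBEGIN_del() { KRULE_display_deletion(); }

    // Executes every rule triggered "after del()"; none in this sketch.
    void KEND_del() {}

    // Rule body: this rule has no condition clause, so the action always runs.
    void KRULE_display_deletion() {
        trace.push_back("Part " + part_no + " deleted");
    }
};

// Helper for experiments: returns the call trace produced by one del().
inline std::vector<std::string> trace_of_del(const std::string& part_no) {
    PartSketch p;
    p.part_no = part_no;
    p.del();
    return p.trace;
}
```

Calling trace_of_del on a part shows the "before" rule running ahead of the user-defined body, which is the ordering described above.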

6.3.2  Mapping  of  Implementation 

Most of the K.3 control structures are similar to those of C++. The mappings for such
constructs are straightforward. We will concentrate on the mappings that posed a challenge
in  the  implementation. 

Operations  on  variables  declared  over  Domain  classes  do  not  pose  any  problem,  as 
references to Domain class objects are similar to those of C++ classes.

define Part : Entity in ManufSchema is
  associations:
    public:
      Aggregation ->
      {
        part_no : String;
        description : Text;
        avg_cost : Dollar;
        qty_on_hand : Integer;
        part_value : Real;
        no_of_pins : Integer;
      };
      Generalization ->
      { Board; IntegratedCircuit; Resistor; Capacitor };
  methods:
    public:
      method display();
      method type() : String;
  rules:
    rule no_null_part_no is
      triggered on_commit create()
                after update(part_no)
      condition part_no = ""
      action
        "RULE: Part::no_null_part_no\n".display();
        "*ERROR* Part should always have a non-empty "+
        "part_no.\n".display();
        del();
    end;

    rule display_deletion is
      triggered before del()
      action
        part_no.fdisplay("Part %s deleted\n");
    end;
end Part;

Figure 6-2. Example of specification defined in K.3

References to entity class objects, however, are object identifiers (oid) which need to be mapped to a
C++ virtual memory address before any operation can be performed on the referenced
object. Variables declared over entity classes are declared over the template
EReference<T>, where T is the name of the entity class. This template defines the
method "load", which loads the referenced instance from the KB and returns its virtual
memory address.

#define _KOFS_Part_part_no sizeof(_KObject)
#define _KOFS_Part_description _KOFS_Part_part_no + sizeof(_KDEF_String)
#define _KOFS_Part_avg_cost _KOFS_Part_description + sizeof(_KDEF_String)
#define _KOFS_Part_qty_on_hand _KOFS_Part_avg_cost + sizeof(_KDEF_Real)
#define _KOFS_Part_part_value _KOFS_Part_qty_on_hand + sizeof(_KDEF_Integer)
#define _KOFS_Part_no_of_pins _KOFS_Part_part_value + sizeof(_KDEF_Real)
class _KCLASS_Part : public _KObject {
  KCOMPLEXMEMBERS(_KCLASS_Part);
public:
  domain2<_KDEF_String,_KOFS_Part_part_no> part_no;
  domain2<_KDEF_String,_KOFS_Part_description> description;
  domain2<_KDEF_Real,_KOFS_Part_avg_cost> avg_cost;
  domain2<_KDEF_Integer,_KOFS_Part_qty_on_hand> qty_on_hand;
  domain2<_KDEF_Real,_KOFS_Part_part_value> part_value;
  domain2<_KDEF_Integer,_KOFS_Part_no_of_pins> no_of_pins;
  void create();
  void _KBEGIN_create();
  void _KEND_create();
  void del();
  void _KBEGIN_del();
  void _KEND_del();
  void destroy();
  void _KBEGIN_destroy();
  void _KEND_destroy();
  void display();
  void _KBEGIN_display();
  void _KEND_display();
  _KDEF_String type();
  void _KBEGIN_type();
  void _KEND_type();
  void no_null_part_no(const void *that);
  static int _KRULE_no_null_part_no(const void *that);
  void display_deletion(const void *that);
  static int _KRULE_display_deletion(const void *that);
  static _KTYPE *_KType();
  virtual _KTYPE *_KgetType() const;
  virtual _KObject *_KexecMethod(const char*, int, _KObject*[]);
  int _KBEGINUPDATE(void *attr_ptr);
  int _KENDUPDATE(void *attr_ptr);
};

typedef _KdbObject<_KcolEntity<_KCLASS_Part, _KTID_Part>, _KTID_Part>
  _KINST_Part;


Figure 6-3. C++ code generated by the K.3 compiler for the specification given in Figure 6-2.

For example, the following K.3 expression on variable x, defined over an entity
class:

x.y; 

is mapped to the following C++ expression:
(x.load()).y; 
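
The idea behind this mapping can be sketched as follows. This is an illustration under stated assumptions, not the actual EReference<T> template: the names Oid, ObjectStoreSketch, EReferenceSketch, PartData and read_y are hypothetical, an in-memory map stands in for the KB, and caching the loaded address is an assumption of the sketch.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical object identifier type.
using Oid = unsigned long;

// In-memory stand-in for the KB: maps oids to stored instances.
template <typename T>
class ObjectStoreSketch {
public:
    void put(Oid oid, const T& value) { objects_[oid] = std::make_shared<T>(value); }

    // Returns the address of the stored instance, or nullptr if unknown.
    T* fetch(Oid oid) {
        auto it = objects_.find(oid);
        return it == objects_.end() ? nullptr : it->second.get();
    }

private:
    std::map<Oid, std::shared_ptr<T>> objects_;
};

// A reference to an entity object: holds an oid and resolves it on demand.
template <typename T>
class EReferenceSketch {
public:
    EReferenceSketch(ObjectStoreSketch<T>& store, Oid oid)
        : store_(&store), oid_(oid), cached_(nullptr) {}

    // Maps the oid to a virtual memory address, fetching on first use.
    T& load() {
        if (cached_ == nullptr) cached_ = store_->fetch(oid_);
        assert(cached_ != nullptr && "oid does not refer to a stored instance");
        return *cached_;
    }

private:
    ObjectStoreSketch<T>* store_;
    Oid oid_;
    T* cached_;
};

struct PartData { int y; };

// The K.3 expression "x.y" becomes "(x.load()).y".
inline int read_y(EReferenceSketch<PartData>& x) { return (x.load()).y; }
```

Every access through the reference goes through load(), so the rest of the generated code never manipulates oids directly.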



There  are  some  K.3  expressions  which  do  not  have  a  counterpart  in  C++.  Examples 
of  those  are  context  expressions  and  expression  lists.  Context  expressions  are  mapped  to 
a set of support functions. Support functions build a query tree and invoke the
eval_context() method in the class KBMS to perform query evaluation, passing the
constructed query tree as a parameter.

Expression  lists,  as  presented  in  Chapter  4,  are  expressions  that  are  evaluated  on  an 
object  returned  by  another  expression,  called  the  left-hand  object.  All  the  names  used  in 
each  expression  are  assumed  to  be  in  the  scope  of  the  class  in  which  the  left-hand  object 
has  an  instance.  The  C++  mapping  for  this  statement  uses  a  temporary  variable.  For 
example,  the  following  expression  list: 

x() { y=1, z() }

where "y" and "z" are defined in the class which defines the return value of "x", is
mapped to the following in C++:

{ temp = x(); temp.y=1; temp.z(); }
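
A compilable version of this expansion might look as follows. The type ResultSketch and the functions x and apply_expression_list are hypothetical stand-ins, and binding the temporary by reference is an assumption of the sketch; the text does not state how the generated temporary is declared.

```cpp
#include <cassert>

// Hypothetical class of the left-hand object returned by "x".
struct ResultSketch {
    int y = 0;
    int z_calls = 0;
    void z() { ++z_calls; }
};

// Stand-in for the K.3 method "x", which returns the left-hand object.
inline ResultSketch& x(ResultSketch& target) { return target; }

// Expansion of the expression list: "x" is evaluated exactly once, and each
// expression in the list is then applied to the same temporary.
inline void apply_expression_list(ResultSketch& target) {
    ResultSketch& temp = x(target);
    temp.y = 1;
    temp.z();
}
```

The point of the temporary is that the left-hand expression is not re-evaluated for each member of the list.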
The mapping of a rule body with a condition clause is a straightforward translation
to C++ "if" statements, except for the case of guarded expressions. In this case, nested
"if" statements are used. For example, the following guarded expression:
(X,Y|Z) 

is  mapped  to: 

if(X)  { 

if(Y)  { 

if(Z)

6.4 The Compilation Process

6.4.1  Invoking  the  K.3  Compiler 

The K.3 compiler is invoked from the Unix shell prompt as follows:
k  <flags>  [  <file_name>  ] 
where  <flags>  are  the  compiler  flags,  and  <file_name>  is  the  name  of  the  input  file.  The 
file  name  can  be  omitted  in  some  cases,  depending  on  the  compiler  flags  that  are  used. 
Table  6-1  presents  the  flags  recognized  by  the  compiler.  Notice  that  no  file  name  needs 
to  be  specified  with  the  -,  -o  and  -QP  flags. 

6.4.2  Directory  structure 

The  K.3  compiler  exploits  the  hierarchical  directory  structure  of  Unix.  Environment 
variables  are  used  to  specify  the  path  to  the  different  directories  used  by  K.3.  In  Table  6- 
2,  we  present  the  environment  variables  used  to  specify  paths  for  K.3  directories. 
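
As an illustration of how a program could pick up these directories, the sketch below reads the three variables named in Table 6-2 with std::getenv. The struct K3Paths, the helper env_or and the fallback value "." are assumptions made for this example; the text does not say what the compiler does when a variable is unset.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical holder for the three K.3 directories.
struct K3Paths {
    std::string kpath;     // compiler system files
    std::string kdbpath;   // database files
    std::string kprojdir;  // generated code and auxiliary files
};

// Returns the value of an environment variable, or the fallback if the
// variable is unset or empty.
inline std::string env_or(const char* name, const std::string& fallback) {
    const char* value = std::getenv(name);
    return (value != nullptr && *value != '\0') ? std::string(value) : fallback;
}

// Resolves the directories; falling back to "." is an assumption.
inline K3Paths resolve_k3_paths() {
    K3Paths paths;
    paths.kpath = env_or("KPATH", ".");
    paths.kdbpath = env_or("KDBPATH", ".");
    paths.kprojdir = env_or("KPROJDIR", ".");
    return paths;
}
```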

6.4.3  Steps  in  the  compilation  process 

Compilation  of  K.3  specifications  is  a  multi-pass  compilation  process.  This  process 
is  illustrated  in  Figure  6-4.  The  following  is  a  description  of  each  of  the  steps  in  this 
process: 

Preprocessing.  Preprocessing  is  done  by  the  C  preprocessor.  During  this  step,  macros 
(specified  with  the  #define  directive)  are  expanded  and  external  source  files  (specified 
with  the  #include  directive)  are  included  in  the  input. 

Parsing.  This  step  performs  lexical  scanning  (or  "tokenization"  of  the  input),  does 
syntax  checking,  and  builds  an  internal  representation  known  as  the  K  Abstract  Syntax 
Tree (KAST). This common tree representation is used in later stages of the compilation
process. If any syntax error occurs, or the -S flag (syntax-only check) was specified, the
compilation  process  is  terminated. 


Table 6-1. Compilation flags recognized by the K.3 compiler

Flag            Description

-V (verbose)    Tells the compiler to echo messages of its status. If omitted,
                the compiler will not display any message except error
                messages.

-o (output)     Generate a file called "krun", which is the run-time module
                that is executed to run a K program.

-QP             Generate a file called "QP2", which is the Query Processor
                module that is executed to run OQL ad-hoc queries.

-MAKESCHEMA     Generate schema definition files, i.e. C++ .h files corresponding
                to the C++ class definition of K classes, including programs
                (since K programs are also classes).

-POPULATE       Populate the dictionary and symbol table files with the class
                definitions compiled.

-BOOTSTRAP      Used only when bootstrapping a metaschema, to avoid linking
                the resulting application with the existing metaschema (this
                will avoid duplicate symbol linking errors).

-BIND           Generate bindings for parameterized rules. If omitted, the
                compiler will not bind any parameterized rules to any classes.

-DEBUG          Run the compiler in "debug" mode. Outputs tracing messages
                during compilation of a program. Useful only when debugging
                the K.3 compiler.

-g              Generate debugging information in the object code generated.
                Needed if going to use a debugger to debug K programs.

-S              Check syntax only.

-               Read K specification from standard input instead of file.

Table 6-2. Environment variables used to specify paths for K.3 directories.

Variable        Description

KPATH           Directory where the K.3 compiler system files reside

KDBPATH         Directory where K.3 database files are located

KPROJDIR        Directory that will contain generated code and auxiliary files
                used in semantic checking

Symbol-table  creation.  During  this  step,  a  new  symbol  table  is  created  from  the 
KAST.  The  symbol  table  is  used  during  semantic  checking  and  code  generation. 
Previously-defined symbols are retrieved from a symbol-table file (_KSchema.s).


Figure 6-4. The compilation process

Semantic checking. During semantic checking, the semantic correctness of the input
is  verified.  Any  semantic  errors  are  displayed  and  will  result  in  the  termination  of  the 
compilation  process. 

Intermediate  code  generation.  C++  code  generation  of  implementation  (not 
specification)  is  done  at  this  stage,  by  making  use  of  the  KAST  and  the  symbol  table. 
This  includes  method  implementation  and  rule  bodies.  C++  code  generation  of  class 
specifications  is  done  at  a  later  stage. 

Dictionary  population.  If  the  -POPULATE  flag  is  specified,  the  dictionary  population 
phase  will  instantiate  in  the  KB  the  specifications  given  in  the  input.  New  instances  of 
metaclasses  are  created  and  metaexpressions  are  evaluated.  A  symbol  table  file 
(_KSchema.s) is created to contain all the symbols defined so far.

Parameterized  rule  binding.  If  the  -BIND  flag  is  given,  parameterized  rules  are 
bound  at  this  stage.  For  every  new  instance  of  a  metaclass,  the  parameterized  rules 
defined  for  the  metaclass  are  bound  to  the  classes  specified  in  the  bind_classes  clause, 
using  the  algorithm  that  was  presented  in  Chapter  5.  For  each  class  <class_name>  to 
which  rules  are  bound,  a  file  called  _K<class_name>BR.k  is  generated.  The  files  are  then 
fed  back  into  the  compiler,  which  compiles  the  rules  and  stores  them  in  the  KB. 

Schema code generation. This stage generates C++ code with class specification, if
the -MAKESCHEMA flag was specified. For each class with name <class_name>
defined, two files are generated: _K<class_name>.h and _K<class_name>.c.h. The first file contains the
C++ class specification. The second file contains the implementation of system-defined
methods such as "create", "del", etc., as well as the _KBEGIN_x and _KEND_x methods
used for rule triggers. A file called KDEFS.h, which includes all the generated header
files, is generated together with the file KDECLS.h, which has forward declarations of
internal constants and types. Another file called templnst.c is generated, containing
declarations of template instantiations. A file KIMPL.c includes all implementation files.
A Makefile is generated, to be used in the machine code generation phase.

Machine  code  generation.  In  this  stage,  the  executable  code  is  generated  by  invoking 
the  g++  compiler.  This  happens  if  the  -o  or  -QP  flags  were  specified.  A  generated 
Makefile  is  used  for  this  purpose.  Two  executable  files  may  be  generated.  If  the  -o  flag 
was  used,  a  file  called  "krun"  (run-time  module)  is  generated.  If  the  -QP  flag  was  used, 
a  file  called  "QP2"  (ad-hoc  query  processing  module)  is  generated. 
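
The way the flags select phases can be summarized in a small sketch. It is a model of the description above rather than the compiler's actual driver: the function planned_phases is hypothetical, it assumes an input specification is supplied and that no syntax or semantic error occurs, and it does not model what happens when only -o or -QP is given without a file name.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Returns the phases that would run for a given set of flags, in order,
// assuming the input compiles without errors.
inline std::vector<std::string> planned_phases(const std::set<std::string>& flags) {
    std::vector<std::string> phases = {"preprocessing", "parsing"};

    // A syntax-only check terminates the process after parsing.
    if (flags.count("-S")) return phases;

    phases.push_back("symbol-table creation");
    phases.push_back("semantic checking");
    phases.push_back("intermediate code generation");

    // The remaining phases are optional and driven by flags.
    if (flags.count("-POPULATE")) phases.push_back("dictionary population");
    if (flags.count("-BIND")) phases.push_back("parameterized rule binding");
    if (flags.count("-MAKESCHEMA")) phases.push_back("schema code generation");
    if (flags.count("-o") || flags.count("-QP")) phases.push_back("machine code generation");
    return phases;
}
```

The ordering in the sketch follows the order in which the steps are described; schema code generation precedes machine code generation because the generated Makefile is consumed by the latter.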

6.5  Problems  Encountered 

As  usual,  the  implementation  of  K.3  was  not  free  of  problems.  The  main  source  of 
trouble  was  the  use  of  templates.  Each  template  instantiation  results  in  the  generation  of 
additional machine code, one piece of code for each instantiated template. As a result, the
executables of K.3 range from 8 to 12 megabytes.

The K.3 compiler was originally implemented on the Sun4/SunOS4 platform. We
needed to move to the new CISE domain, which runs on the Sun4/Solaris2 platform. This
forced us to migrate to a new version of the g++ compiler, g++ 2.7.2. This version, however,
has bugs in template instantiation, for which we needed to use some tricks such as
including all implementations into a single file (KIMPL.c). Also, the exception-handling
statements "try" and "catch" are not well supported inside programs that instantiate
templates, causing a run-time error in the g++ compiler itself. We needed this feature to
support nested begin_trans...end_trans statements. A bug report was sent to GNU, but no
response had been received at the time of this writing.

To make template names unique, each name is formed by concatenating the
template name, the K.3 class name and its class-id. The class-id is obtained from the KB,
and  it  changes  when  the  metaschema  is  changed,  e.g.,  a  new  metaclass  is  defined.  As  a 
result,  after  a  bootstrapping,  template  instantiations  generated  in  previous  compilations 
become  outdated,  resulting  in  link  errors.  Therefore,  parts  of  the  system  need  to  be 
recompiled  to  generate  the  new  template  instantiations. 
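
The staleness problem can be made concrete with a small sketch. The functions instantiation_name and is_stale are hypothetical, and the "_" separator is an assumption; the text only states that the three parts are concatenated.

```cpp
#include <cassert>
#include <string>

// Builds a unique instantiation name from the template name, the K.3 class
// name and the class-id. The "_" separator is an assumption of this sketch.
inline std::string instantiation_name(const std::string& template_name,
                                      const std::string& class_name,
                                      unsigned class_id) {
    return template_name + "_" + class_name + "_" + std::to_string(class_id);
}

// Object code compiled against an older class-id refers to a name that the
// regenerated code no longer defines, which surfaces as a link error.
inline bool is_stale(const std::string& linked_name,
                     const std::string& template_name,
                     const std::string& class_name,
                     unsigned current_class_id) {
    return linked_name != instantiation_name(template_name, class_name, current_class_id);
}
```

If a bootstrapping step shifts the class-id of a class, every name built from the old id fails this check, which is why the affected parts of the system must be recompiled.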

Templates allowed a rapid implementation, but they are not strictly
necessary. However, since templates are at the heart of the system implementation,
removing them would require a redesign of the whole system.

Another  problem  is  associated  with  compilation-time  expression  evaluation.  The 
Dynamic  Expression  Evaluator  (DEE)  implemented  in  the  KBMS  makes  use  of  type 
information  generated  by  the  K.3  compiler  in  C++.  Therefore,  expressions  that  use  types 
defined  in  the  same  compilation  cannot  be  evaluated  at  compilation  time.  This  posed  a 
problem  in  some  model  extensions,  for  which  a  string  had  to  be  used  instead  of  the  actual 
type  name. 


CHAPTER  7 
PROOF  OF  CONCEPT 


In  this  chapter,  we  present  the  model  extensions  and  application  programs  that  we 
defined  and  developed  to  prove  our  concept.  We  were  successful  in  the  use  of  K.3  to 
extend  the  model  with  new  class,  association  and  constraint  types,  and  to  define  other 
class  and  association  properties  such  as  indices.  We  also  implemented  a  Workflow 
Management  Support  System  which  makes  use  of  model  extensions  such  as  the  Activity 
and  Process  class  types  and  control  associations  like  Sequential,  Parallel,  etc.  for  defining 
process  models. 

7.1  Association-Type  Extensibility 

An  association  type  defines  the  structural  properties  and  operational  semantics  of  a 
set  of  like  associations.  Structural  properties  are  inherent  in  the  kernel  model: 
structurally,  an  association  is  a  set  of  association  links  which  connect  an  instance  in  the 
defining  class  with  instances  in  the  constituent  classes.  Operational  semantics  are  distinct 
among  different  association  types,  and  are  represented  by  ECAA  rules. 

The  metaclass  Assoc  is  the  base  class  for  all  association  types.  It  defines  the  common 
structural  properties  of  association  types,  e.g.,  a  defining  class,  a  set  of  association  links 
each  of  which  has  a  constituent  class,  etc.  An  association  type  is  defined  by  a  metaclass 
derived  from  class  Assoc.  Examples  of  this  are  the  kernel  association  types  Generalization 
and Aggregation, represented by metaclasses that bear the same names. The operational
semantics  of  an  association  type  can  be  defined  by  means  of  parameterized  rules. 

A set of supporting methods is needed for processing parameterized rules. They are
used either to substitute parameters or to evaluate a condition in the "bind_if" clause in
parameterized rules. These methods were defined in the classes Association and
AssocLink. The following is a brief description of the methods:

definingClass():  Returns  the  name  of  the  defining  class  of  an  association. 

constClassName():  Returns  the  name  of  the  constituent  class  of  an  association  link. 

linkNames(): Returns a comma-delimited list of link names of an association.

constituentClasses():  Returns  a  comma-delimited  list  of  constituent  class  names  of  an 
association. 
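
The behavior of these supporting methods can be sketched in C++ as follows. The structs AssocLinkSketch and AssocSketch and the helper join_with_commas are hypothetical, and the exact delimiter (a comma with no following space) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Joins names into a comma-delimited list, e.g. "a,b,c".
inline std::string join_with_commas(const std::vector<std::string>& items) {
    std::string out;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i > 0) out += ",";
        out += items[i];
    }
    return out;
}

// Hypothetical association link: a name plus its constituent class.
struct AssocLinkSketch {
    std::string name;
    std::string constituent_class;

    std::string constClassName() const { return constituent_class; }
};

// Hypothetical association: a defining class and a set of links.
struct AssocSketch {
    std::string defining_class;
    std::vector<AssocLinkSketch> links;

    std::string definingClass() const { return defining_class; }

    std::string linkNames() const {
        std::vector<std::string> names;
        for (const AssocLinkSketch& link : links) names.push_back(link.name);
        return join_with_commas(names);
    }

    std::string constituentClasses() const {
        std::vector<std::string> names;
        for (const AssocLinkSketch& link : links) names.push_back(link.constituent_class);
        return join_with_commas(names);
    }
};
```

Returning strings rather than object references fits the use of these methods for parameter substitution, where the result is spliced into the text of a rule.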

A  list  of  the  association  types  which  have  been  defined  is  presented  in  Table  7-1. 
Most  of  them  were  defined  to  support  workflow  management.  They  will  be  discussed 
later  in  this  chapter  when  we  talk  about  the  implementation  of  a  Workflow  Management 
Support  System.  An  example  that  uses  the  association  type  Interaction  was  given  in 
Chapter  4. 

7.2  Constraint-Type  Extensibility 

A  constraint  type  defines  the  properties  of  a  constraint.  Constraints  are  used  to  enforce 
the security and integrity of a database. Typically, a DBMS supports a predefined set of
constraints. However, it is desirable for the set of constraint types to be extensible so
that new constraint types can be defined and used in the system without having to change
the implementation of the DBMS or the specification language. This was the goal of our
set of experiments on extensible constraint types.

Table 7-1: Association types defined

Association     Application   Semantics
type            domain

Interaction     General       Referential constraint

Sequential      Workflow      Instance (activity) in constituent (activity) class
                              is initiated after the instance (activity) in the
                              defining (activity) class is completed.

Parallel        Workflow      Instances (activities) in the constituent (activity)
                              classes are initiated in parallel after the instance
                              (activity) in the defining (activity) class is
                              completed.

Testing         Workflow      One of two possible instances (activities) in the
                              constituent (activity) classes is initiated
                              conditionally after the instance (activity) in the
                              defining (activity) class is completed.

Case            Workflow      One of many possible instances (activities) in
                              the constituent (activity) classes is initiated
                              conditionally after the instance (activity) in the
                              defining (activity) class is completed.

Decomposition   Workflow      The constituent class is a process model which
                              represents the details of an activity. The
                              defining class is an activity which abstracts the
                              detailed process.

The  metaclass  Constraint  was  defined  for  this  purpose.  Any  constraint  defined  in  the 
system  will  be  represented  by  an  instance  in  this  class.  Different  constraint  types  are 
defined  as  metaclasses  derived  from  this  class.  Given  constraint  type  X,  X  is  a  metaclass 
derived  from  Constraint,  and  any   constraint  of  type  X  defined  in  the  system  will  be 
represented by an instance in both X and Constraint. The definition of this model
extension  in  the  metaschema  is  shown  in  Figure  7-1. 


[Diagram: the metaclasses Constraint, Class and Assoc, with the links
constraintDefClass, constraintDefLinks (SET), constraints (SET) and constituentClass]

Figure 7-1. Model extensions to support constraint types


We  have  identified  two  constraint  categories:  (i)  constraints  on  a  class,  and  (ii) 
constraints  on  a  set  of  (one  or  more)  association  links.  Constraints  on  a  class  are 
applicable  to  all  the  instances  of  that  class.  For  example,  a  MAX(n)  constraint  can  be 
defined  on  class  X  to  set  the  maximum  number  of  instances  of  X  to  "n".  Constraints  on 
a  set  of  association  links  are  applicable  to  the  associations  (or  "values")  represented  by 
the  association  links.  For  example,  a  RANGE(x,y)  constraint  on  an  Aggregation 
association  called  "score"  may  be  used  to  define  "x"  and  "y"  as  the  minimum  and 
maximum values of the attribute "score". The two constraint categories were defined by
modifying  the  specification  of  the  metaclasses  Class  and  AssocLink,  as  shown  in  Figure 
7-1.  An  attribute  called  "constraints",  defined  as  a  set  of  Constraint  instances,  was  defined 
on  both  Class  and  AssocLink  to  represent  the  fact  that  a  class  and  an  association  link  may 
have a set of constraints. The inverse associations "constraintDefClass" and
"constraintDefLinks",  defined  in  the  metaclass  Constraint,  specify  that  a  constraint  is 
defined  for  either  a  class  or  a  set  of  association  links. 

A set of supporting methods was defined in the metaclasses AssocLink, Class and
Constraint.  These  methods  can  be  used  as  parameters  in  parameterized  rules.  The 
following  is  a  brief  explanation  of  each  supporting  method. 

defineConstraint(x).  Adds  constraint  <x>  to  the  set  of  constraints.  Defined  in  the 
metaclasses  Class  and  AssocLink,  to  define  a  constraint  on  a  class  or  a  set  of  association 
links,  respectively. 

linkNames(). Returns a comma-delimited list of link names on which a constraint is
defined.

definingClass(). Returns the name of the defining class of the association links that are
subject to the constraint.

constituentClasses(). Returns a comma-delimited list of the names of the constituent
classes of the association links that are subject to the constraint.

subClasses(). Returns a comma-delimited list of subclass names, if the association
type of the links is Generalization.

constClassName(x). Returns the name of the constituent class of the association
link <x>.

assocLinkName(x). Returns the name of the association link <x>.
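
The relationship between defineConstraint and the inverse associations can be sketched as follows. The structs ClassMeta, LinkMeta and ConstraintSketch are hypothetical simplifications of the metaclasses Class, AssocLink and Constraint; in particular, maintaining the inverse link inside defineConstraint is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct ConstraintSketch;

// Simplified stand-in for the metaclass Class.
struct ClassMeta {
    std::string name;
    std::vector<ConstraintSketch*> constraints;
    void defineConstraint(ConstraintSketch& c);
};

// Simplified stand-in for the metaclass AssocLink.
struct LinkMeta {
    std::string name;
    std::string constituent_class;
    std::vector<ConstraintSketch*> constraints;
    void defineConstraint(ConstraintSketch& c);
};

// Simplified stand-in for the metaclass Constraint.
struct ConstraintSketch {
    std::string type;                           // e.g. "MAXOBJ" or "RANGE"
    ClassMeta* constraintDefClass = nullptr;    // set for constraints on a class
    std::vector<LinkMeta*> constraintDefLinks;  // set for constraints on links

    // Comma-delimited names of the links on which the constraint is defined.
    std::string linkNames() const {
        std::string out;
        for (const LinkMeta* link : constraintDefLinks) {
            if (!out.empty()) out += ",";
            out += link->name;
        }
        return out;
    }
};

// Adds the constraint to the class and records the inverse association.
inline void ClassMeta::defineConstraint(ConstraintSketch& c) {
    constraints.push_back(&c);
    c.constraintDefClass = this;
}

// Adds the constraint to the link and records the inverse association.
inline void LinkMeta::defineConstraint(ConstraintSketch& c) {
    constraints.push_back(&c);
    c.constraintDefLinks.push_back(this);
}
```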

In our approach, constraints are enforced by rules which are triggered by the events
that may violate them. A boolean expression is evaluated to verify that the constraint is
not violated; if it evaluates to false, the transaction is aborted. The parameterized rules which we have
defined for the different supported constraint types follow this paradigm. Table 7-2
presents the constraint types that have been defined. Each constraint has an associated
macro which is used as a keyword in the specifications. The semantics of each constraint
are expressed in English. They were defined by parameterized rules, as shown in the
metaschema  specifications  found  in  Appendix  A. 
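
The enforcement paradigm (evaluate a condition, otherwise abort) can be sketched in C++ as follows. This is not the code generated from the parameterized rules: the exception TransactionAborted stands in for abort_tx, and the functions enforce, check_range, check_cardinality and abort_message are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Stand-in for abort_tx: a transaction abort is modeled as an exception.
struct TransactionAborted : std::runtime_error {
    explicit TransactionAborted(const std::string& msg) : std::runtime_error(msg) {}
};

// Generic shape of a constraint rule: if the condition does not hold, the
// "otherwise" branch aborts the transaction.
inline void enforce(bool condition, const std::string& constraint,
                    const std::string& class_name) {
    if (!condition) {
        throw TransactionAborted(constraint + " constraint violated in class " + class_name);
    }
}

// RANGE(min,max): checked after an update of the attribute value.
inline void check_range(int value, int lo, int hi, const std::string& class_name) {
    enforce(value >= lo && value <= hi, "RANGE", class_name);
}

// CARDINALITY(min,max): checked against the number of associated instances.
inline void check_cardinality(std::size_t count, std::size_t lo, std::size_t hi,
                              const std::string& class_name) {
    enforce(count >= lo && count <= hi, "CARDINALITY", class_name);
}

// Helper for experiments: returns the abort message, or "" if no abort occurs.
inline std::string abort_message(std::size_t count, std::size_t lo, std::size_t hi,
                                 const std::string& class_name) {
    try {
        check_cardinality(count, lo, hi, class_name);
    } catch (const TransactionAborted& e) {
        return e.what();
    }
    return "";
}
```

Because every constraint type reduces to a boolean condition plus an abort, a new constraint type only needs a new condition, which is what makes the parameterized-rule approach extensible.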

7.3  Class-Type  Extensibility 

Four  class  types  are  defined  in  the  kernel  model  of  K.3,  namely  Entity,  Domain, 
Aggregate  and  Schema.  A  class  type  defines  a  set  of  properties  that  are  common  to  a  set 
of  classes.  Notice  that  a  class  type  defines  the  properties  of  classes.  Classes  are  instances 
of  metaclasses.  The  model  is  extended  with  a  new  class  type  by  defining  a  metaclass 
which  defines  the  common  properties  of  the  new  type.  The  metaclass  Class  is  the  base 
class  of  all  metaclasses  that  define  class  types. 

Currently,  there  is  a  limitation  on  the  class  types  that  can  be  defined.  New  class  types 
can  only  be  defined  as  subtypes  of  the  predefined  class  types,  i.e.  they  must  extend  an 
existing class type. One reason for this is that a class type, in addition to operational
semantics, also has structural properties. Such properties cannot be represented by
parameterized rules. Structural properties that are common to classes of the same type are
defined by means of a base class, e.g. EClassObject for classes of type Entity and
DClassObject for classes of type Domain. However, to define such base classes, we need to
assign a class type to them, so one of the existing types must be used anyway. This
is similar to the "chicken and egg" paradox.

The need to define different class types came from the application of our technology
to workflow management, which is described in more detail later in this
chapter. In workflow management, a set of activities and their control-flow relationships
and constraints are modeled to represent the behavior of working entities (human or
automated, acting in different roles) in an enterprise. We can model workflow by
representing activity types as classes, and their control relationships by control
associations.


Table 7-2: Constraint types defined

Constraint            Macro                 Semantics

Maximum # of          MAXOBJ(n)             The maximum number of instances in the
objects                                     object class cannot exceed <n>.

Unique (key)          UNIQUE                Attribute has a unique value among all the
                                            instances in a class.

Fixed                 FIXED                 The value of an attribute cannot be changed
                                            after it has been set.

Non-null              NON_NULL              The value of an attribute cannot be null.

Range                 RANGE(min,max)        The value of an attribute must be between
                                            <min> and <max>.

Enumeration           ENUM(values)          The value of an attribute must be one of the
                                            values given in <values>.

Composite key         COMPKEY(keys)         The set of attributes given in <keys> form a
                                            value that should be unique among all the
                                            instances in class.

Derived value         DERIVE(exp)           The value of an attribute must be derived
                                            from the expression given in <exp>.

Inverse-link          INVERSE(link)         The association link given in <link> is
                                            defined in the constituent class, and must
                                            refer to the instance in the defining class that
                                            is associated with the instance in the
                                            constituent class.

Set equality          SETE(a,b)             If an object has an instance in class <a>,
                                            then it should also have an instance in class
                                            <b>, and vice versa.

Set exclusion         SETX(a,b)             If an object has an instance in class <a>,
                                            then it cannot have an instance in class <b>,
                                            and vice versa.

Set-subset            SETS(a,b)             If an object has an instance in class <a>,
                                            then it should have an instance in class <b>.

Cardinality           CARDINALITY(min,max)  An instance in the defining class must be
                                            associated with a minimum of <min> and a
                                            maximum of <max> instances in the
                                            constituent class.

Total 
Specialization 

TS 

An  object  that  has  an  instance  in  a  superclass 
Specialization  must  be  an  instance  in  one  of 
its  subclasses. 
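
Several of the constraint types in Table 7-2 can be read as parameterized checks over an object and the extent of its class. The following Python sketch is an illustration under stated assumptions, not the K.3 mechanism: objects are modeled as dictionaries, and the function names `non_null`, `in_range`, `unique`, `maxobj` and `violations` are hypothetical.

```python
# Illustrative sketch: each function builds a check(obj, extent) -> bool.
# Objects are plain dicts; "extent" is the list of all instances of the class.

def non_null(attr):
    """NON_NULL: the value of the attribute cannot be null."""
    return lambda obj, extent: obj.get(attr) is not None


def in_range(attr, low, high):
    """Range: the value of the attribute must be between low and high."""
    return lambda obj, extent: low <= obj[attr] <= high


def unique(attr):
    """UNIQUE: no other instance in the class has the same attribute value."""
    return lambda obj, extent: all(
        other is obj or other[attr] != obj[attr] for other in extent)


def maxobj(n):
    """MAXOBJ(n): the number of instances in the class cannot exceed n."""
    return lambda obj, extent: len(extent) <= n


def violations(obj, extent, constraints):
    """Return the names of the constraints that obj violates in its extent."""
    return [name for name, check in constraints.items()
            if not check(obj, extent)]
```

The point of the sketch is that a constraint keyword names a reusable, parameterized piece of semantics, so the check is written once and bound to many classes rather than repeated in each method.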

The class type Activity defines activity classes. Activity classes are an extension of
entity classes (e.g., they have an oid) with additional structural and operational semantics.
All activity classes have a qualification attribute that identifies them as MANUAL or
AUTOMATED. The structural semantics of activity instances (i.e., instances of activity
classes) include (i) a set of time stamps, used to record the times at which the state of an
activity changes, including initialization time, start time, and completion time, and (ii)
the status of the activity, which is INACTIVE (pending to be started), ACTIVE or
COMPLETED. Operational semantics include (i) the update of the corresponding time
stamps every time the activity status is changed, and (ii) the checking of
preconditions/postconditions before/after the execution of the activity.
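
The structural and operational semantics of activity instances described above can be sketched as follows. This is a minimal Python illustration, not the K.3 specification given in the Appendix; the names `ActivityInstance`, `Status`, `start`, `complete` and the injectable `clock` parameter are assumptions introduced for the example.

```python
import time
from enum import Enum


class Status(Enum):
    INACTIVE = "INACTIVE"    # pending to be started
    ACTIVE = "ACTIVE"
    COMPLETED = "COMPLETED"


class ActivityInstance:
    """Hypothetical stand-in for an instance of an activity class."""

    def __init__(self, qualification="MANUAL", precondition=None,
                 postcondition=None, clock=time.time):
        self.qualification = qualification      # MANUAL or AUTOMATED
        self.precondition = precondition or (lambda activity: True)
        self.postcondition = postcondition or (lambda activity: True)
        self._clock = clock
        self.status = Status.INACTIVE
        # One time stamp per state change; unset stamps stay None.
        self.timestamps = {"initialization": clock(), "start": None,
                           "completion": None}

    def start(self):
        # Operational semantics (ii): check the precondition before starting.
        if self.status is not Status.INACTIVE:
            raise RuntimeError("activity already started")
        if not self.precondition(self):
            raise RuntimeError("precondition not satisfied")
        self._set_status(Status.ACTIVE, "start")

    def complete(self):
        # Operational semantics (ii): check the postcondition before completing.
        if self.status is not Status.ACTIVE:
            raise RuntimeError("activity is not active")
        if not self.postcondition(self):
            raise RuntimeError("postcondition not satisfied")
        self._set_status(Status.COMPLETED, "completion")

    def _set_status(self, status, stamp):
        # Operational semantics (i): a status change updates its time stamp.
        self.status = status
        self.timestamps[stamp] = self._clock()
```

In K.3 this behavior is attached to the class type rather than coded in each class, so every activity class inherits it without repeating the logic.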

We have defined the metaclass Activity and the base class AClassObject, derived
from the metaclass Entity and the base class EClassObject, respectively. The metaclass Activity
defines  the  class  type  Activity,  and  AClassObject  is  the  base  class  for  all  activity  classes. 
This  is  specified  by  means  of  the  special  attribute  "base  class"  in  the  metaclass  Activity. 
The  specification  of  these  classes  is  given  in  the  Appendix. 

The class types that we have defined are presented in Table 7-3. For each class type,
the table gives the application domain where it is commonly used (or "General" if it is
generally applicable), the name of its base class, if any, and its relevant properties and
semantics.


Table 7-3: Class types defined

Class type  App. domain  Base class  Properties + semantics
----------  -----------  ----------  ------------------------------------------
Activity    Workflow     AClassObj   OID, timestamps, status. Timestamps
                                     updated when status changes.
Process     Workflow     PClassObj   OID, status, start activity. Start
                                     activity initiated when the process is
                                     started.
ProcessDef  Workflow     (none)      A schema consisting of classes of type
                                     Activity and control associations, a
                                     starting activity and a set of
                                     completion activities.
CodeBlock   Prototyping  (none)      Source code. Used to define a piece of
                                     code in a method.

7.4  Index  Specification  as  a  Model  Extension

Virtually  all  DBMSs  use  indices  to  improve  the  performance  of  queries.  Indices  are 
defined  over  a  set  of  attributes,  which  are  known  as  the  index  key.  We  believe  that  the 
definition  of  indices  should  be  part  of  the  class  specification,  because  (i)  it  makes  the 
class  specification  more  complete  by  telling  the  reader  of  the  specification  which  indices 
are  defined  for  the  class,  (ii)  it  formalizes  the  specification  of  indices,  allowing  system 
components  such  as  a  query  optimizer  to  obtain  information  on  indices  from  the  class 
specification, and (iii) some semantics of indices can be captured in the model itself.

We  have  extended  the  model  to  include  index  definition  in  the  class  specification.  In 
our  approach,  we  extended  the  underlying  KBMS  with  index  support  by  mapping  the 
new  index  functions  to  the  index  functions  provided  by  the  underlying  storage  manager 
[Tam95].  We  then  extended  the  model  to  represent  indices  as  objects  so  that  index 
operations  (insert,  update,  delete)  are  performed  by  sending  messages  to  index  objects. 
Such  operations  are  mapped  to  function  calls  to  the  KBMS'  index  library. 

In our model, we define an index as a holder of instance:value pairs which has
fast-lookup  functions  to  retrieve  references  to  instances  given  a  value  or  a  range  of  values. 
We  have  defined  the  following  semantics  of  indices,  where  "I"  is  an  index  on  attribute 
"x",  and  "n"  is  an  instance  of  the  class  for  which  the  index  is  defined: 

1) When a new instance n is created, add the instance:value pair n:x to I.

2) When an instance n is deleted, remove the instance:value pair n:x from I.

3) Before the value of x is updated, remove the instance:value pair n:x from I.

4) After the value of x is updated, insert the instance:value pair n:x into I.

Notice that the above semantics can be captured by parameterized rules which are
bound to the class for which the index is defined. We defined the Index metaclass for this
purpose, as shown in Figure 7-2. The properties of indices include (i) a unique index
identifier ("index_id") assigned by the underlying storage manager, (ii) an index number
("index_no"), which identifies the index within the entity class, (iii) the attribute on
which the index is defined ("index_attr"), (iv) the entity class for which the index is
defined ("index_eclass"), and (v) a unique system-assigned index name ("index_name").
A set of methods was defined to support index initialization and update. These methods
are invoked at run-time by the parameterized rules that are shown in the figure, which are
bound to the class for which an index is defined. The method "indexed_class" is a
supporting method used in the "bind_classes" clause of the parameterized rules to indicate
that the rules are to be bound to the class for which the index is defined.
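
The four index-maintenance semantics listed above can be sketched in Python as hooks on create, delete and update. This is an illustration under stated assumptions rather than the K.3 implementation: the in-memory `Index` below replaces the storage manager's index library, and the names `IndexedStore`, `insert_entry`, `remove_entry`, `lookup` and `lookup_range` are hypothetical.

```python
from collections import defaultdict


class Index:
    """Toy in-memory index: a holder of instance:value pairs for one attribute."""

    def __init__(self, attr):
        self.attr = attr
        self._oids_by_value = defaultdict(set)

    def insert_entry(self, value, oid):
        self._oids_by_value[value].add(oid)

    def remove_entry(self, value, oid):
        self._oids_by_value[value].discard(oid)
        if not self._oids_by_value[value]:
            del self._oids_by_value[value]

    def lookup(self, value):
        return set(self._oids_by_value.get(value, ()))

    def lookup_range(self, low, high):
        # Linear scan for brevity; a real index would use an ordered structure.
        return {oid for value, oids in self._oids_by_value.items()
                if low <= value <= high for oid in oids}


class IndexedStore:
    """Hypothetical class extent whose hooks play the role of the bound rules."""

    def __init__(self, index):
        self.index = index
        self.objects = {}      # oid -> dict of attribute values
        self._next_oid = 1

    def create(self, **attrs):
        oid = self._next_oid
        self._next_oid += 1
        self.objects[oid] = dict(attrs)
        self.index.insert_entry(attrs[self.index.attr], oid)       # semantics 1
        return oid

    def delete(self, oid):
        value = self.objects[oid][self.index.attr]
        self.index.remove_entry(value, oid)                        # semantics 2
        del self.objects[oid]

    def update(self, oid, attr, new_value):
        indexed = attr == self.index.attr
        if indexed:
            self.index.remove_entry(self.objects[oid][attr], oid)  # semantics 3
        self.objects[oid][attr] = new_value
        if indexed:
            self.index.insert_entry(new_value, oid)                # semantics 4
```

Updates to attributes other than the indexed one leave the index untouched, which corresponds to the rules being triggered only by updates of the indexed attribute.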

The DEFINE_INDEX(x) macro is used in the "where" clause of the class
specification to define an index for a class. This macro expands to the following
metaexpression:

    define_index(Index.pnew() { index_attr := x })

where "x" is the name of the association link that represents the attribute on which the
index is defined. An example of a class specification in K.3 that uses this macro is
presented in Figure 7-3.

define Index : MetaClass
  associations:
  public:
    Aggregation ->
    {
      index_id     : Integer;    // returned by the OM
      index_no     : Integer;    // this is the order in Entity
      index_attr   : AssocLink;
      index_eclass : Entity;
      index_name   : String;
      index_siteID : Integer;    // local storage site id
    }

  methods:
  public:
    method initialize() : Integer;
    method initialize(indexName : String, indexType : Integer,
                      siteID : Integer) : Integer;
    method insert_index_entry(attr_value : String, oid : Integer);
    method insert_index_entry(attr_value : Real, oid : Integer);
    method insert_index_entry(attr_value : Integer, oid : Integer);
    method insert_index_entry(attr_value : Character, oid : Integer);
    method remove_index_entry(attr_value : String, oid : Integer);
    method remove_index_entry(attr_value : Real, oid : Integer);
    method remove_index_entry(attr_value : Integer, oid : Integer);
    method remove_index_entry(attr_value : Character, oid : Integer);
    method destroy_index();
    method indexed_class() : String;
    method indexed_attr() : String;

  rules:
    rule delayed_init is
      triggered on_commit create()
      condition index_eclass != null
      action
        index_siteID := index_eclass.siteID;
        index_id := initialize();
    end;

    param_rule index_insert
      bind_classes @indexed_class()
      triggered after create(), update(@indexed_attr())
      action
        @indexed_class().insert_index_entry(@index_no, oid(),
                                            @indexed_attr());
    end;

    param_rule index_remove
      bind_classes @indexed_class()
      triggered before del(), update(@indexed_attr())
      action
        @indexed_class().remove_index_entry(@index_no, oid(),
                                            @indexed_attr());
    end;

end Index;

Figure 7-2. Specification of the metaclass Index in K.3


define Part : Entity in ManufSchema is
  associations:
  public:
    Generalization
      {Board, IntegratedCircuit, Resistor, Capacitor};
    Aggregation ->
    {
      part_no     : String;
      description : Text;
      avg_cost    : Dollar;
      qty_on_hand : Integer;
      part_value  : Real;
      no_of_pins  : Integer;
    };

  where:
    DEFINE_INDEX(part_value);
end Part;

Figure 7-3. Example of a class specification that uses the DEFINE_INDEX macro


7.5  Test  Case:  A  Workflow  Management  Support  System

Workflow management is the ability to control the flow of data and the operations carried
out by application systems and users who play different roles in a project to reach
different enterprise goals. To reach a goal, users and application systems have to perform
different activities which involve the coordination of individual tasks as well as the
interchange of large amounts of information. A process definition [Hol94] is an
abstraction used to specify (i) the activities that need to be performed to achieve a goal,
(ii) their control relationships, (iii) the users who play different roles and perform
different work items to complete the activities, and (iv) the tools or application systems
needed for carrying out a project. Furthermore, a process definition is a formalization
which can be translated into a control software program. A Workflow Management
System (WFMS) is an automated system that, driven by a process definition, controls
and constrains the control flow by (i) allowing or disallowing certain activities to be
performed based on a set of data conditions and rules, (ii) allowing or disallowing users
to undertake different work items in activity instance lists, (iii) performing automated
and/or manual support activities, and (iv) recording metrics which are used for evaluating
the functionality and performance of a process.

The following is a description of the modeling constructs used in a process definition:

Process. A process is a set of process activities that are connected in order to achieve
a common goal. Such activities may consist of manual activities or workflow (or
automated) activities.

Participant. Participants in a process are those persons or entities that participate in
a process by performing a set of work items known as a work list.

Process  Role.  A  process  role  associates  participants  with  a  collection  of  activities  in 
a  process.  A  participant  may  have  many  process  roles  in  different  processes. 

Process  Activity.  An  activity  is  a  logical  step  or  description  of  a  piece  of  work  that 
contributes  toward  the  achievement  of  a  process.  A  process  activity  may  include  a 
manual  activity  and/or  an  automated  workflow  activity. 

Work  Item.  A  work  item  is  the  most  basic  unit  of  work.  It  represents  work  to  be 
processed  in  the  context  of  an  activity.  Work  items  are  performed  by  participants. 

A high-level object-oriented semantic model can be used to capture the semantics of
a process definition. We have extended the model to incorporate (i) process activity
classes, used to represent the semantics and metrics of process activities, (ii) process
classes, which capture the semantics of processes, (iii) control associations, to model the
control-flow relationships of activities, (iv) control constraints, which limit the flow of
control based on a set of data conditions, (v) process definition classes, which capture
the semantics of process definitions, and (vi) administration classes used to represent
administrative components of workflow such as users and roles. An example of a process
definition using these constructs is given in Figure 7-4. A set of activity classes
(represented by rectangles) is defined, each of which has a set of control associations
such as Parallel (P), Sequential (S) and Synchronization (Y).

[Diagram not reproduced. Project: PrepareProduction. Activity classes shown: Initiate
Production Cycle, Analyze Market, Prepare Bill of Materials, Analyze Product Design,
Prepare Production Schedule, Prepare Purchase Orders, Update AP Files, Hire Personnel,
Receive Materials, Initiate Production. Project data: PurchaseOrder, MarketData,
ProductDesign, BillOfMaterials, ProductionSchedule, PlantSetup, Employee, Materials,
ProductionPlan.]

Figure 7-4. Example of a process definition


We  designed  and  implemented  in  K.3  a  Workflow  Management  Support  System 
(WFMSS)  which,  driven  by  a  process  definition  defined  by  extended  modeling  constructs, 
supports  a  Workflow  Management  System  (WFMS).  The  use  of  K.3  for  implementing  a 
WFMS  has  the  following  advantages.  First,  the  semantics  of  workflow  constructs  can  be 
defined  by  extending  the  underlying  model  using  the  high-level  abstractions  of  K.3. 
Second,  parameterized  rules  can  be  used  to  capture  the  semantics  that  are  general  to  the 
workflow  application  domain.  Third,  explicit  rules  can  be  used  to  capture  semantics  that 
are  specific  to  a  given  process  model.  Fourth,  once  the  model  is  extended  to  support 
workflow, the resulting abstractions (e.g., workflow classes and control associations) can
be used to define in a high-level fashion the process models of a workflow, which are
compiled to generate a WFMSS that captures the semantics of the process models.

The  architecture  of  the  system  is  illustrated  in  Figure  7-5.  The  WFMSS  acts  as  a 
server  to  a  Workflow  Engine  (WFE).  The  WFE  is  a  multi-threaded  process  which 
interacts  with  the  WFMSS  by  means  of  a  Workflow  Manipulation  Language  (WFML). 
The  WFML  includes  commands  for  the  creation/termination  of  activities  and  processes, 
and  the  retrieval  of  information  about  the  status  and  metrics  of  activities  and  projects. 
WFML  is  based  on  the  OQL  query  language  [Ala89].  The  WFMSS  uses  the  query 
processing module of the KBMS to process WFML commands. The WFE controls a set
of supporting applications which may also update the data in the KB. A set of Workflow
Desktops  (WFDs)  serve  as  user  interfaces  and  interact  with  the  WFE  to  execute  and 
retrieve  information  about  workflow  enactments. 

Figure  7-6  presents  the  metaschema  with  the  extensions  made  to  support  workflow 
management.  Three  new  class  types  were  defined,  namely  Activity,  Process  and 
ProcessDef. Activity classes define the properties of activities, while Process classes
define the properties of workflow projects. A class of type ProcessDef represents a
process definition, which defines a set of interrelated activities to capture the semantics
of the workflow of a project. Process definitions are represented by the metaclass
ProcessDef, defined as a subclass of the metaclass Schema. A process definition has, in
addition to the properties of a normal schema, other attributes such as the starting activity
(i.e., the activity that is initiated first when a project is started) and a set of completion
activities (i.e., the activities that, once completed, mark the completion of a process).

Control  associations  are  used  to  capture  the  semantics  of  control  flow  (or  "routing"). 
They  define  the  execution  relationship  between  the  defining  activity  (i.e.  the  defining 
class  of  type  Activity)  and  the  constituent  activities.  We  defined  different  control 
association  types,  such  as  (i)  Sequential,  which  states  that  the  constituent  activity  is 
initiated  when  the  defining  activity  is  completed,  (ii)  Parallel,  which  states  that  the 
constituent  activities  are  initiated  in  parallel  when  the  defining  activity  is  completed,  (iii) 
Testing,  which  conditionally  branches  to  one  of  two  possible  execution  paths,  and  (iv) 
Case,  which  conditionally  branches  to  one  of  many  (two  or  more)  possible  execution 
paths.  New  constraint  types  were  defined  to  capture  the  semantics  of  conditional 
workflow  (used  by  the  Testing  and  Case  association  types)  and  conditional  initiation  of 
activities  like  preconditions  (which  must  be  satisfied  before  an  activity  is  started)  and 
postconditions  (which  must  be  satisfied  before  an  activity  can  be  completed).  The 
Decomposition  association  type  is  used  to  define  the  process  definition  of  an  activity.  In 
an  association  of  this  type,  the  defining  class  is  an  activity  for  which  a  detailed  process 
definition  is  defined  in  the  constituent  class.  Many  of  these  association  types  were 
defined  by  parameterized  rules.  The  specifications  of  these  classes  are  given  in  the 
Appendix. 
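
The routing semantics of the four control association types described above can be sketched in Python. This is an illustration only and not the parameterized-rule definitions given in the Appendix: the class and field names, the dictionary representation of a process definition, and the function `on_completed` are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Sequential:
    """The constituent activity is initiated when the defining one completes."""
    constituent: str

    def successors(self, data):
        return [self.constituent]


@dataclass
class Parallel:
    """All constituent activities are initiated in parallel."""
    constituents: List[str]

    def successors(self, data):
        return list(self.constituents)


@dataclass
class Testing:
    """Conditionally branches to one of two possible execution paths."""
    condition: Callable[[Dict[str, Any]], bool]
    if_true: str
    if_false: str

    def successors(self, data):
        return [self.if_true if self.condition(data) else self.if_false]


@dataclass
class Case:
    """Conditionally branches to one of many possible execution paths."""
    selector: Callable[[Dict[str, Any]], Any]
    branches: Dict[Any, str]

    def successors(self, data):
        return [self.branches[self.selector(data)]]


def on_completed(process, activity, data):
    """Return the activities to initiate once `activity` has completed.

    `process` maps a defining activity name to its control associations.
    """
    initiated = []
    for association in process.get(activity, []):
        initiated.extend(association.successors(data))
    return initiated
```

A workflow engine would call something like `on_completed` each time an activity reaches the COMPLETED status, passing the project data on which the Testing and Case conditions are evaluated.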

Given the extended model, a KBA, known as the Workflow Modeler, defines in K.3
a process definition using the extended modeling constructs. The specification of the
process definition is compiled, and the resulting system is the Workflow Management
Support System. Notice that higher-level specifications result from the use of a
semantically-rich object model instead of "burying" the intended semantics inside method
implementations.

Workflow  Management  is  still  an  area  of  intensive  research.  In  this  section,  we  have 
just  given  an  overview  of  it  to  illustrate  the  use  of  the  extensible  K.3  to  model  and 
implement  a  complex  system  such  as  a  Workflow  Management  System.  More  details  can 
be  found  in  the  work  by  Kunnisetty  [Kun96],  and  the  Workflow  Management  Coalition 
[Hol94]. 


CHAPTER  8 

CONCLUSIONS  AND  FUTURE  RESEARCH  DIRECTIONS 

8.1  Conclusions 

Object-oriented  DBMSs  and  DBPLs  suffer  from  the  problem  that  their  underlying 
object models are fixed and too simple. As a result, a lot of application semantics have
to be implemented in methods. This is difficult for application systems developers, and
the implemented semantic properties and constraints are difficult, if not impossible, for
others to understand since they are buried in the implemented code. In this dissertation,
we  have  presented  the  design  and  implementation  of  an  extensible  object-oriented 
knowledge  base  programming  language,  K.3.  The  underlying  model  of  K.3,  OSAM*/X, 
is  a  model  that  can  be  extended  and  customized  to  satisfy  the  modeling  requirements 
of  a  diverse  range  of  application  domains.  It  allows  the  use  of  higher-level  abstractions 
for  more  complete  and  explicit  specifications  of  schemas.  K.3  provides  the  facilities  for 
the  definition  of  model  extensions  (metaschemas)  and  application  schemas.  When  the 
underlying  model  is  extended  by  the  knowledge-base  customizer  (KBC),  the  language 
automatically  reflects  the  extensions  without  having  to  rewrite  or  modify  the  compiler. 
The  keywords  and  macros  defined  by  the  KBC  are  used  by  knowledge-base 
administrators  (KBAs)  and  application  developers  to  specify  the  semantics  of  the 
conceptual schemas of databases and application schemas defined in application programs,
respectively. The resulting specifications in K.3 are more complete and easier to
understand.

The three main contributions of this work can be summarized as follows:

1) An extensible knowledge base programming language serving as a universal
language for the specification and implementation of model extensions, conceptual
schemas and application schemas.

2) An extensible object-oriented semantic model which can be extended to satisfy the
modeling requirements of diverse application domains.

3) An extensible knowledge base management system (KBMS) which can be tailored to
satisfy the requirements of diverse application domains by extending its underlying object
model.

In  our  research,  we  have  identified  some  limitations  of  our  approach,  as  discussed 
below. 

8.1.1  Limitations  in  model  extensibility 

We  have  defined  model  extensions  to  (i)  represent  both  well-known  and  special 
constraint  types,  (ii)  incorporate  index  definitions  as  class  properties  and  define  indices 
in  the  class  specification  to  enhance  the  expressiveness  of  the  specification  language, 
(iii) define new types of associations such as control associations used in workflow
models, and (iv) define workflow models to support a Workflow Management System.

Our  approach  uses  procedural  semantics  as  the  paradigm  for  defining  model 
extensions  without  changing  the  system  implementation.  However,  some  model 
extensions  may  have  structural  properties  which  require  a  change  in  the  underlying 
KBMS. An example is a new class type whose properties differ from those of the kernel
model and which thus cannot be defined by subclassing from the existing class types.
Such an extension cannot be represented by procedural semantics; it requires changing
the kernel model and the underlying KBMS to support it.

8.1.2  Limitations  in  language  extensibility

Our  language  is  extensible  with  respect  to  class  types,  association  types,  constraint 
types,  and  extensible  class,  association  and  method  properties.  In  our  approach  to 
language  extensibility,  the  aspects  of  the  model  that  are  extensible  have  to  be  known  a 
priori.  For  example,  we  did  not  consider  rules  as  extensible.  Therefore  there  is  no 
language  construct  that  allows  the  incorporation  of  extensions  to  rules  in  the  language 
without  modifying  the  compiler. 

One  problem  associated  with  language  extensibility  is  that  although  many  extended 
languages  may  have  the  same  semantics,  they  may  differ  in  their  syntax  (e.g.,  constraint 
keywords  and  association-type  names).  Unless  a  standard  terminology  is  chosen  (which 
must include a standard set of keywords), the use of K.3 as a specification language for
both  communication  and  specification  purposes  may  be  difficult. 

8.2  Future  Research  Directions 

The  concept  and  techniques  of  extensibility  introduced  in  this  work  can  be  applied  to 
achieve  other  types  of  extensibilities,  namely  extensible  structural  properties,  system 
extensibility  and  extensible  graphical  tools. 

As  we  mentioned  before,  some  model  extensions  may  have  structural  properties  not 
supported  by  the  underlying  KBMS.  In  these  cases,  the  procedural-semantics  approach 
does  not  work.  Other  means  need  to  be  found  to  represent  such  model  extensions  and 
to  incorporate  them  into  the  KBMS  without  having  to  change  its  implementation.  This 
leads  to  the  issue  of  system  extensibility. 



On  system  extensibility,  more  research  needs  to  be  done  so  that  model  extensions  can 
result  in  system  extensions.  A  possible  approach  is  to  define  the  system  components  as 
metaclasses,  implementing  just  a  "kernel"  system  in  a  lower-level  language  like  C++  or 
C,  then  implementing  the  rest  on  top  of  it  in  a  higher-level  language  like  K.3.  The 
behaviors  of  system  components  can  be  defined  by  rules,  and  their  extensions  by 
parameterized  rules. 

Graphical  data  modeling  tools  have  already  become  popular.  In  our  research  center, 
we  have  implemented  our  own  tools  called  XGTools.  If  the  underlying  model  is 
extensible,  then  XGTools  should  also  be  extensible  so  that  its  implementation  does  not 
have  to  be  changed  to  support  the  extended  model.  More  sophisticated  methods  must  be 
used,  such  as  user-defined  icons  and  language  mappings.  An  initial  attempt  was  made  by 
Gurrapu [Gur95]. More work needs to be done to make the extensibility of tools more
general. Most probably, graphical tools must be fully driven by the underlying KBMS.